Python-ize your research


I agree with David Veenman that Stata is great software. However, Stata progresses slowly. Every now and then there is an update, but development is top-down. Users can contribute bottom-up through ado files, but if you try to write an ado file yourself you are bound to get stuck: Stata's programming language is a real challenge.

More of a pain is Stata's approach to data, which is arcane: you can have only one dataset in memory at a time, and running scripts reminds me of GW-BASIC; it is hardly programming. This in turn leads to code maintenance problems.
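For contrast, here is a minimal sketch of how Python with pandas keeps several datasets in memory at once. The firm names and numbers are made up for illustration.

```python
import pandas as pd

# Two hypothetical datasets held in memory side by side
prices = pd.DataFrame({"firm": ["A", "B", "C"], "price": [10.0, 20.0, 30.0]})
shares = pd.DataFrame({"firm": ["A", "B", "D"], "shares": [100, 50, 75]})

# Combine them without saving or reloading either one;
# an inner join keeps only firms present in both tables
merged = prices.merge(shares, on="firm", how="inner")
merged["market_cap"] = merged["price"] * merged["shares"]

print(merged)
```

Both `prices` and `shares` remain available after the merge, so later steps can reuse them.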

I tried R as an alternative to Stata. Yes, the code is versatile, but there are too many modules that do similar things: data.table, data frames, matrices. This leads to messiness too.

Both R and Stata may have been functional in the past, when only dedicated programmers could contribute to the development of these packages, and when only universities would pay for the software, thus restricting access.

Python?

Today, however, great software is free, and thousands of people are happy to contribute new features. This is where Python has a clear advantage: it is bottom-up, but without the chaos that characterises R. There is a large Python support community that helps you move on quickly, particularly via Stack Overflow.

And it must be said, there is a lot going for Python:

Python is free.
Python has great research packages, specifically SciPy, pandas, NumPy and statsmodels. So you can do Fama-MacBeth regressions and two-way clustering in Python.
pandas seems to have been written with the financial industry in mind; for example, there is a stash-load of date functions. The financial functions on offer are basically all the ones that Excel has.
The code is nicely structured, code maintenance is easy.
Python is much more programming than scripting, which allows you to produce clean code with your own functions that take care of routine tasks.
There is a Datastream package for Python.
Updating Python (via Anaconda) is seamless.
Graphs in Python are mouth-watering.
Stack Overflow offers lots of support.
WRDS supports Python.
Teaching Python can be done using PythonAnywhere.
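
To give a flavour of the Fama-MacBeth point and of writing your own functions for routine tasks, here is a minimal sketch using only NumPy and pandas. The function name `fama_macbeth`, the column names and the simulated panel are all assumptions made for illustration. The standard errors are the plain time-series ones, without a Newey-West or any other adjustment.

```python
import numpy as np
import pandas as pd


def fama_macbeth(panel, y, xs, time):
    """Hypothetical helper: one cross-sectional OLS per period, then the
    time-series mean and plain standard error of the coefficients."""
    rows = {}
    for period, group in panel.groupby(time):
        # Design matrix with an intercept column
        X = np.column_stack([np.ones(len(group)), group[xs].to_numpy()])
        beta, *_ = np.linalg.lstsq(X, group[y].to_numpy(), rcond=None)
        rows[period] = beta
    betas = pd.DataFrame.from_dict(rows, orient="index",
                                   columns=["const"] + list(xs))
    mean = betas.mean()
    se = betas.std(ddof=1) / np.sqrt(len(betas))
    return pd.DataFrame({"coef": mean, "se": se, "t": mean / se})


# Simulated panel (assumption): 60 months x 50 firms, true slope of 0.5
rng = np.random.default_rng(0)
months = pd.period_range("2020-01", periods=60, freq="M")
n_obs = len(months) * 50
x = rng.normal(size=n_obs)
panel = pd.DataFrame({
    "month": months.repeat(50),
    "x": x,
    "ret": 0.01 + 0.5 * x + rng.normal(scale=0.05, size=n_obs),
})

result = fama_macbeth(panel, y="ret", xs=["x"], time="month")
print(result)
```

On the simulated data the estimated slope on `x` should land close to the true value of 0.5. For real work, statsmodels or a dedicated panel package is likely a better starting point than this hand-rolled version.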

All in all, given the world we live in today, Python looks promising: it offers maximum freedom to analyse data the way you want.
