Coding in Python is sufficiently faster than coding in C++ that I do so wherever possible. Graphical interfaces and most file IO perform acceptably when implemented in Python; I use C++ only for the portions of algorithms that would increase application run time by an order of magnitude were they implemented in Python. Any increase in Python interpreter speed reduces the amount of C++ required to achieve acceptable performance.
The Intel x86-64 C++ compiler version 11.1 is expensive, but I've been pleased with the optimizations it delivers and its robust C++ support (I have written valid templated code that it compiles and MSVC 9 does not). Suspecting that I could significantly improve the Python interpreter's run times for some of my applications, I grabbed the Python 2.6.4 source and the associated dependency sources. I rebuilt all of these with ICC 11.1 x64, targeting Core2 with MMX, SSE, SSE2, SSE3, SSSE3, and SSE4.1 enabled, using /O3 and global optimizations.
This resulted in a ~15% run time reduction for my pure Python test application. Hoping for more improvement, I again rebuilt Python and its dependencies, this time with profile-guided optimization logging enabled. I then re-ran my test cases and rebuilt again using the resulting logs. The resulting interpreter binaries execute my pure Python test case 31% faster than the official distribution!
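For anyone wanting to try a similar comparison, here is a minimal sketch of a timing harness built on the standard `timeit` module. It is not my actual test application: `workload`, `best_time`, and `percent_reduction` are hypothetical names made up for illustration, and the loop is just a stand-in for whatever pure Python code matters to you.

```python
import sys
import timeit


def workload(n=20000):
    """Hypothetical pure-Python workload: sum the squares of odd numbers below n."""
    total = 0
    for i in range(n):
        if i % 2:
            total += i * i
    return total


def best_time(func, repeat=5, number=20):
    """Return the best per-call time in seconds over several repeats.

    Taking the minimum reduces noise from other processes on the machine.
    """
    timer = timeit.Timer(func)
    return min(timer.repeat(repeat=repeat, number=number)) / number


def percent_reduction(baseline_seconds, candidate_seconds):
    """Run time reduction of the candidate relative to the baseline, in percent."""
    return 100.0 * (baseline_seconds - candidate_seconds) / baseline_seconds


if __name__ == "__main__":
    # Report which interpreter ran the script and how long one call took.
    print("%s: %.6f s per call" % (sys.version.split()[0], best_time(workload)))
```

Run the same script once under each interpreter build, then pass the two reported times to `percent_reduction` with the official build as the baseline.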
Labels: Optimization, Python