Causality vs Correlation: Granger Causality

Nice intro to co-integration

One of the most repeated mantras of Machine Learning is that

“A Causation is not a Correlation!”

When faced with this statement, I’m never really sure how to respond.  After all, the entire point of science is to measure correlations and other signals and determine models that explain their cause and can predict future events.

It is certainly true, however, that if we are naive, we can fool ourselves into seeing patterns that are not really there. This is especially true in financial and econometric time series, which do not seem to follow any of the simple laws of statistics. In our continuing studies of noisy time series, we do not seek to address “the fundamental philosophical and epistemological question of real causality,” [5] but, rather,

We seek practical methods that can detect a weak signal in noisy time series and model the underlying ‘cause’

Science: the Search…



The inherent danger of working with Git

Writing and correcting scientific papers is tedious; losing progress accidentally is even worse. The same applies to developing scientific computing programs. Version control is a must. Naively, a system-wide backup program such as Time Machine can do the job, but it offers only coarse-grained version control. Dropbox sync can also be used as an ad hoc version control system, but it needs constant human intervention. As an example, when at some point you decide to create a new version of file.tex, what people usually do is: 1) rename the old version of file.tex to file_v1.tex, 2) duplicate file_v1.tex, rename the copy file_v2.tex, and start editing. Before long your Dropbox will be flooded with different versions of the same file, to the point that you have to delete older versions to save space. But which version(s) of file.tex should you delete? … Losing progress is inevitable! You should never ever have to make that decision!

While you are working on your manuscript and saving regularly, ALL your collaborators will probably get a notification for EVERY SINGLE FUCKING SAVE you make along the way. Someone once suggested to me that we should each work in our own local directory and move the files back to the Dropbox directory after the modifications are done, which is a perfect recipe for creating tons of new conflicts.

This is where Git comes in. If PROPERLY used, Git is the final solution. … mind the adv … Git has its own complexity, and some innocent-looking commands, such as rebase and checkout, can cause great damage. If you think that by using Git you will never lose any progress, you are not likely to guard against doing stupid things with Git, like I did. The last time, I lost changes to a manuscript thanks to git rebase. This time, I lost one day’s worth of tedious figure editing and paper corrections due to git checkout. Last time, I recovered the changes from my Time Machine backup. This time, since my Time Machine disk is not around, I have no way to recover them. What prompted me to git checkout? I will rant about it next time.
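
One way to guard against this kind of loss can be sketched as a small shell wrapper. This is only a sketch under an assumption: that the lost work was uncommitted edits in the working tree, which git checkout can overwrite without keeping a copy. The function name safe_checkout is hypothetical and not part of Git; it only relies on git status --porcelain, which prints nothing when the tree is clean.

```shell
# Hypothetical helper (not part of Git): refuse to run `git checkout`
# while the working tree contains uncommitted changes.
safe_checkout() {
    # `git status --porcelain` prints one line per modified, staged, or
    # untracked path, and prints nothing when the tree is clean.
    if [ -n "$(git status --porcelain)" ]; then
        echo "Uncommitted changes found; commit or stash them first:" >&2
        git status --short >&2
        return 1
    fi
    # The tree is clean, so pass all arguments through unchanged.
    git checkout "$@"
}
```

With this in place, safe_checkout -- file.tex would stop with a non-zero status while file.tex still has unsaved edits, instead of silently replacing it with the committed version. The check is deliberately conservative: untracked files also count as "dirty", so it may refuse in cases where a plain git checkout would have been harmless.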