For me, the largest nuisance is that figures are stored as very long strings in the output of cells. This makes it almost impossible to manually resolve conflicts while merging two branches. The solution is simple: just clear the output of the cells before committing the notebook. I'm doing this with a simple script (which I found somewhere on the Internet).
The script clear_output_ipynb.py lives in the same folder (called notebooks) as my Jupyter notebooks. I don't track changes in the .ipynb files, but have "clean" copies of the notebooks (with extension .ipynb.cln) that are part of the Git repository.
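The script itself isn't reproduced in this post. A minimal sketch of the idea might look like the following, assuming the notebooks use the nbformat 4 JSON layout (a top-level cells list in which code cells carry outputs and execution_count). The function names are made up for illustration and are not taken from the original script.

```python
import json
from pathlib import Path


def clear_outputs(notebook: dict) -> dict:
    """Empty the outputs and execution counts of all code cells (nbformat 4 layout assumed)."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def write_clean_copy(ipynb_path: Path) -> Path:
    """Write a cleaned copy next to the notebook, e.g. foo.ipynb -> foo.ipynb.cln."""
    notebook = json.loads(ipynb_path.read_text(encoding="utf-8"))
    clean_path = ipynb_path.with_name(ipynb_path.name + ".cln")
    text = json.dumps(clear_outputs(notebook), indent=1, ensure_ascii=False)
    clean_path.write_text(text + "\n", encoding="utf-8")
    return clean_path
```

The working .ipynb file is left untouched, so the figures stay visible locally while only the .ipynb.cln copy goes into Git.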
To make life easy, I have two makefiles in my project folder called cln.makefile and nbs.makefile. Before I stage the changes in my notebooks, I first run
$ make -f cln.makefile
which runs the script clear_output_ipynb.py for each notebook in my notebooks folder.
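The contents of cln.makefile aren't shown here. If it uses ordinary file targets, make would only re-clean a notebook whose .ipynb file is newer than its .ipynb.cln copy. That rule can be sketched in Python as follows; the function name is hypothetical, and whether the real makefile behaves this way is an assumption.

```python
from pathlib import Path


def stale_notebooks(folder: Path) -> list:
    """Return notebooks whose .ipynb.cln copy is missing or older than the notebook."""
    stale = []
    for ipynb in sorted(folder.glob("*.ipynb")):
        clean = ipynb.with_name(ipynb.name + ".cln")
        # Same test make applies: rebuild when the target is absent
        # or the prerequisite has a newer modification time.
        if not clean.exists() or clean.stat().st_mtime < ipynb.stat().st_mtime:
            stale.append(ipynb)
    return stale
```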
After I pull changes from a remote repository, or switch to another branch, I have to copy all .ipynb.cln files to .ipynb files. For this I have another makefile, and so I run
$ make -f nbs.makefile
before using and modifying the notebooks.
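The copy step is simple enough to sketch directly. The version below is an illustration in Python, not the contents of nbs.makefile, and the function name is hypothetical.

```python
import shutil
from pathlib import Path


def restore_notebooks(folder: Path) -> list:
    """Copy every *.ipynb.cln in folder to the matching *.ipynb and return the new paths."""
    restored = []
    for clean_path in sorted(folder.glob("*.ipynb.cln")):
        target = clean_path.with_suffix("")  # drops only the trailing ".cln"
        shutil.copyfile(clean_path, target)
        restored.append(target)
    return restored
```

Note that this sketch overwrites any existing .ipynb file unconditionally, so local outputs that were never cleaned and committed would be lost.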
Of course, sometimes I forget to clean the notebooks before committing, or I forget to make the .ipynb files. I've tried to automate the process of cleaning and copying with Git "hooks", but I have not been able to make that work. If somebody knows how, let me know!
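One possible starting point, offered as an untested sketch rather than a working solution: Git runs any executable file placed in .git/hooks under the names pre-commit, post-checkout and post-merge, and it runs them from the root of the working tree. The script below could be installed under each of those three names. It assumes the two makefiles sit in the repository root, and the mapping and function names are invented for illustration.

```python
#!/usr/bin/env python3
"""Hypothetical Git hook that dispatches on the name it is installed under."""
import subprocess
import sys
from pathlib import Path

# Assumed mapping from hook name to the make invocations used in this post.
HOOK_COMMANDS = {
    "pre-commit": ["make", "-f", "cln.makefile"],     # clean before committing
    "post-checkout": ["make", "-f", "nbs.makefile"],  # rebuild .ipynb after switching branches
    "post-merge": ["make", "-f", "nbs.makefile"],     # rebuild .ipynb after a pull or merge
}


def command_for(hook_path: str):
    """Return the make command for the hook this file is installed as, or None."""
    return HOOK_COMMANDS.get(Path(hook_path).name)


def run_hook(hook_path: str) -> int:
    """Run the mapped command in the current directory and return its exit status."""
    command = command_for(hook_path)
    if command is None:
        return 0  # not installed under a known hook name: do nothing
    return subprocess.run(command).returncode


# Only act when invoked under one of the hook names above.
if __name__ == "__main__" and command_for(sys.argv[0]) is not None:
    sys.exit(run_hook(sys.argv[0]))
```

A non-zero exit status from pre-commit aborts the commit. The sketch does not stage the regenerated .ipynb.cln files, so a commit made right after cleaning would still contain whatever was staged beforehand; that part would need a further step such as a git add.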