Google Summer of Code With DiRAC

By Biswarup Banerjee

I am a Computer Science Engineering student who is passionate about building web apps and creating products for people and communities.

I have previously worked and interned at several VC-funded tech startups, and this summer I worked remotely with the Data Engineering team at the UW DiRAC Institute as part of Google Summer of Code 2020!

Project Summary

My project revolved around building a Jupyter Notebook/Lab extension. The astronomy community can use it to create and configure Apache PySpark clusters with a few clicks, from within the Jupyter environment, without writing multiple lines of cumbersome code.

Fig 1: The “SparkManager” Extension.

Fig 2: The options to configure an Apache PySpark cluster using SparkManager

Fig 3: The “spark” variable is injected into the Jupyter notebook using SparkManager
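To give an idea of the kind of code the extension saves you from typing, here is a minimal sketch of setting up a "spark" variable by hand with PySpark. This is not SparkManager's own code: the helper names `to_spark_conf` and `build_spark_session` and the option names `executor_memory` and `executor_cores` are hypothetical and used only for illustration. Only the `SparkSession.builder` calls and the `spark.executor.*` configuration keys come from PySpark itself.

```python
def to_spark_conf(options):
    """Translate simple form-style options into Spark configuration keys.

    The option names on the left are hypothetical; the keys on the right
    are standard Spark configuration properties.
    """
    key_map = {
        "executor_memory": "spark.executor.memory",
        "executor_cores": "spark.executor.cores",
    }
    # Unknown options are ignored; values are stored as strings.
    return {
        key_map[name]: str(value)
        for name, value in options.items()
        if name in key_map
    }


def build_spark_session(app_name, master, options):
    """Create (or reuse) a SparkSession configured from the given options."""
    # Imported here so that to_spark_conf works even without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name).master(master)
    for key, value in to_spark_conf(options).items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Example usage (requires PySpark to be installed):
# spark = build_spark_session(
#     "sparkmanager-demo", "local[2]", {"executor_memory": "2g"}
# )
```

With the extension, the equivalent of these steps happens behind the configuration form, and the resulting "spark" variable appears in the notebook without any of this being written out.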

UW DiRAC Data Engineering Team

The best part of my summer project was definitely meeting the entire UW DiRAC team and learning about the astronomy projects that DiRAC is working on, like the LSST!

I feel lucky to have worked with a very experienced, passionate and welcoming team of engineers and astronomers.

It is also fascinating to realize that we were thousands of miles apart and yet worked together so collaboratively.

It was a great learning experience for me, as I learnt a lot about the Jupyter ecosystem and remote collaborative work.

External Links for further reading

  1. The GitHub repository: https://github.com/astronomy-commons/sparkmanager
  2. How “SparkManager” works: https://github.com/astronomy-commons/sparkmanager/blob/master/docs/how_it_works.md
  3. My blog post about how my interest in space science grew: https://medium.com/@biswarupbanerjee/tryst-with-astronomy-and-space-science-b8df974cb159?source=friends_link&sk=802662559d1d0dbfa206e58b05efac86
  4. My LinkedIn account: https://www.linkedin.com/in/techguybiswa/