Most of our data science projects begin with a block of imports that looks something like this 😀 (not exactly, but close):
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import sklearn
from sklearn.preprocessing import OneHotEncoder
from sklearn.manifold import TSNE
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
import sys
import os
import re
import glob
from pathlib import Path
import pickle
import datetime as dt
The list goes on and on if you are dealing with complex projects. The same data science libraries account for more than 99% of our daily imports: for example, sklearn, datetime as dt, and many more. In addition, there are helper modules like pathlib. So let's explore a project called pyforest. It is almost the Python equivalent of importing all of these up front, but way smarter.
pyforest lazy-imports all popular Python data science libraries so that they are always available in your work environment when you need them. This technique is known as lazy importing: if you don't use a library or any of its methods or functions, it is never imported. When you are done with your script, you can export the Python code for the import statements for later use.
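To make the idea of lazy importing concrete, here is a minimal sketch of a proxy object that defers the real import until first use. It is not pyforest's actual implementation: the `LazyModule` class and its `is_loaded` method are hypothetical names chosen for illustration, and the standard-library `json` module stands in for a heavy data science library.

```python
import importlib


class LazyModule:
    """Hypothetical proxy that imports a module on first attribute access."""

    def __init__(self, module_name):
        self._module_name = module_name
        self._module = None  # nothing has been imported yet

    def _load(self):
        # Import once, then reuse the cached module object.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return self._module

    def __getattr__(self, attr):
        # Python calls this only for names not found on the proxy itself,
        # so every module member (functions, classes) is routed through here.
        return getattr(self._load(), attr)

    def is_loaded(self):
        return self._module is not None


json_lazy = LazyModule("json")    # cheap: no import happens here
print(json_lazy.is_loaded())      # False
print(json_lazy.dumps({"a": 1}))  # first use triggers the import, prints {"a": 1}
print(json_lazy.is_loaded())      # True
```

The point of the design is that creating the proxy costs almost nothing, so many of them can exist in a session and only the ones you touch pay the import cost.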
Check out this demo I’ve taken from the library’s GitHub repository:
You need Python 3.6 or higher to run this package.
From the terminal, enter:
pip install pyforest
And you're ready to go.
Note that this also adds pyforest to your IPython default startup settings, so it is loaded automatically into your environment at startup. Outside IPython, import it explicitly at the top of your script:
from pyforest import *
If you use Jupyter or IPython, you can even skip this line, because pyforest adds itself to the autostart.
When you are done with your script, you can export all import statements via:
active_imports()
You can see an overview of all available lazy imports by typing lazy_imports() in Python.
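The two reporting features described above (listing every available import and exporting only those that were actually used) can be pictured with a small registry. This is a rough sketch under stated assumptions, not pyforest's code: `TrackedImport`, `available_statements`, and `used_statements` are hypothetical names, and standard-library modules are used so the example runs anywhere.

```python
import importlib


class TrackedImport:
    """Hypothetical lazy import that remembers its own import statement."""

    def __init__(self, statement, module_name):
        self.statement = statement       # e.g. "import json"
        self._module_name = module_name
        self._module = None              # stays None until first use

    def __getattr__(self, attr):
        # Reached only for names not set in __init__, i.e. module members.
        if self._module is None:
            self._module = importlib.import_module(self._module_name)
        return getattr(self._module, attr)

    def used(self):
        return self._module is not None


def available_statements(registry):
    """Import statements of every registered entry, used or not."""
    return [entry.statement for entry in registry]


def used_statements(registry):
    """Import statements of entries that were actually touched."""
    return [entry.statement for entry in registry if entry.used()]


# Hypothetical registry with two entries.
json_ = TrackedImport("import json", "json")
math_ = TrackedImport("import math", "math")
registry = [json_, math_]

math_.sqrt(16)                         # first use triggers the real import
print(available_statements(registry))  # ['import json', 'import math']
print(used_statements(registry))       # ['import math']
```

Pasting the output of the second function at the top of a finished script would give it explicit imports, which is the workflow the export feature is meant to support.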
If you are missing an import, you can add it to the pyforest imports by following the contribution guidelines.
Excited yet? pyforest currently includes pandas, NumPy, matplotlib, and many more data science libraries, and it is regularly maintained by its developers.
Please visit the official GitHub page for contributions and support.
We've also provided a video lecture on the same topic.