Make - Using Make Commands

There is a lot of functionality included in the GeoAI-Retail template. Instead of writing endless documentation detailing how to find and use all these resources, we have created a file, make.bat, containing shortcuts for a whole boatload of tasks. As with almost all the functionality in this template, it came out of our own need to streamline workflows and avoid digging around in the template to get boring, routine tasks accomplished. The commands available in make.bat fall into three general categories: data preprocessing, data management, and environment management.

Data Preprocessing

More than anything else, GeoAI-Retail is a geographic feature engineering engine to create quantitative factors for use in machine learning modeling. GeoAI-Retail can then be used to revise the features and perform inferencing using the models created from the original data. Fortunately, for inferencing only a small amount of feature engineering needs to be performed.

> make data

The initial step of preparing the data for analysis can take a considerable amount of time. The heavy lifting is performed by a script, make_data.py. While this script can be run directly, to make life easier you can invoke it with the command make data.

Data Management

Although the code can be synchronized with version control, typically GitHub, datasets can be large and frequently do not work well with version control. As a result, the data directory is excluded from version control in the .gitignore file and can instead be saved to Azure Blob Storage.

> make get_data

This is particularly useful when collaborating on a project. After retrieving a project from version control, you can fetch the data it needs using this command. The data is downloaded from Azure Blob Storage using credentials saved in the .env file and automatically extracted to the ./data directory.
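To illustrate the two local steps described above (reading credentials from .env and extracting the archive into ./data), here is a minimal sketch using only the Python standard library. It is not the code behind make get_data: the function names read_env_file and extract_data_archive are hypothetical, the key names stored in .env are not assumed, and the Azure download itself is left out.

```python
import shutil
from pathlib import Path


def read_env_file(env_path):
    """Parse simple KEY=VALUE lines from a .env file into a dict (hypothetical helper)."""
    settings = {}
    for raw_line in Path(env_path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def extract_data_archive(archive_path, data_dir="./data"):
    """Unpack an already-downloaded archive into the data directory (hypothetical helper)."""
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    # unpack_archive infers the format (zip, tar, ...) from the file extension.
    shutil.unpack_archive(str(archive_path), str(target))
    return target
```

In this sketch, the values returned by read_env_file would be handed to whatever client performs the download, and extract_data_archive would run once the archive is on disk.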

> make push_data

This creates a zipped archive of the entire contents of the ./data directory and pushes it to Azure Blob Storage using credentials saved in the .env file.
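The archiving half of this step can be sketched with the standard library alone. This is an illustration under stated assumptions, not the template's implementation: archive_data_directory and the default archive name are hypothetical, and the upload to Azure Blob Storage is omitted.

```python
import shutil
from pathlib import Path


def archive_data_directory(data_dir="./data", archive_stem="data"):
    """Zip the full contents of data_dir and return the archive path (hypothetical helper)."""
    source = Path(data_dir).resolve()
    if not source.is_dir():
        raise FileNotFoundError(f"data directory not found: {source}")
    # make_archive appends ".zip" to the stem and stores paths relative to root_dir.
    # Keep the archive outside data_dir so it does not end up inside itself.
    return Path(shutil.make_archive(str(archive_stem), "zip", root_dir=str(source)))
```

Storing paths relative to the data directory means the archive can later be unpacked straight into ./data without an extra nesting level.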

Environment Management

Managing the Python Conda environment is dramatically streamlined by the commands contained in make.bat. Quite honestly, this was one of the largest motivating factors for creating it in the first place.

> make env

This is the most commonly used command. It creates a Conda environment using the name set up when the project was originally created. Due to some nuances of how Conda is configured with ArcGIS Pro, you cannot simply create a new environment directly from environment.yml. Rather, you have to clone the default ArcGIS Pro Conda environment, arcgispro-py3, and then update the new environment using the environment.yml file. Additionally, if you want to use the mapping widget in Jupyter Lab, there are two additional steps. All of this is consolidated into a single command.
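The clone-then-update pattern described above can be sketched as two standard Conda invocations. This is an assumption about the general approach, not a copy of what make.bat runs: build_env_commands and the example environment name are hypothetical, and the two extra Jupyter Lab steps are not shown because the text does not specify them.

```python
import subprocess


def build_env_commands(env_name, base_env="arcgispro-py3", env_file="environment.yml"):
    """Return the conda commands for cloning a base environment, then updating it."""
    # Step 1: clone the default ArcGIS Pro environment under the project name.
    clone_cmd = ["conda", "create", "--name", env_name, "--clone", base_env]
    # Step 2: apply the project's environment.yml to the cloned environment.
    update_cmd = ["conda", "env", "update", "--name", env_name, "--file", env_file]
    return [clone_cmd, update_cmd]


def create_project_env(env_name):
    """Run the commands in order, stopping at the first failure."""
    for cmd in build_env_commands(env_name):
        subprocess.run(cmd, check=True)
```

Calling create_project_env("my-project") would require conda to be available on the PATH; build_env_commands can be used on its own to inspect the commands without running them.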

> make env_activate

Sometimes the environment name is a little long, and sometimes you cannot recall what it is. Either way, it does not matter. This command will activate the project environment created using the command above, so you can get to work!

> make env_remove

Multiple project environments quickly litter your computer. Once you are finished with a project's environment, this command makes it easy to remove it from the machine.