Setup Virtual Environment in Python

In this post, I’ll demonstrate how to set up a virtual environment in Python. Setting one up is important when you’re developing several projects, because each project may require different packages and a different Python version. Inside a virtual environment, you can install what you need and it won’t influence …

Key Concept of Selecting Data from DataFrame in Python

As a data scientist or analyst, you probably use Python to manipulate data quite often, and the pandas and NumPy packages are a popular way to do it. However, as a beginner or intermediate Python user, you may be confused about how to get data out of a DataFrame and try to google …

How to Improve HiveQL Efficiency

The efficiency of data queries strongly influences the user experience. When you’re creating a report by loading data from a database, a slow query is frustrating. I know that feeling; it is part of daily life in analytics. In this post, I’ll summarize four simple methods to improve HiveQL efficiency. …