
How to Use Generators in Python and 3 Use-cases That Simplify Your Code

What will you learn?

What a Generator is in Python and how to use Generators to work with large datasets in a Pythonic fashion.

What is a Generator?

A Generator is a function that returns a lazy iterator. Said differently, you can iterate over the result, but it is lazy: the code in the function body is only executed when you actually iterate over it.

A simple example could be as follows.

def my_generator():
    # Do something
    yield 5
    # Do something more
    yield 8
    # Do something else
    yield 12

Then you can iterate over the generator as follows.

for item in my_generator():
    print(item)

This will print 5, 8, and 12.

At first sight, this doesn’t look very useful. But let’s understand a bit better what happens.

In the first iteration of the for-loop, the code in the my_generator function is executed until it reaches the first yield.

Then it stops and returns the value after yield.

In the next iteration, it will continue where it left off and execute until it reaches the next yield.

Then it stops and returns the value after yield.

And so forth, until there are no more yield statements. At that point the function returns and the for-loop ends.
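To see the pausing behavior directly, you can drive a generator by hand with next(). This is a minimal sketch: the traced_generator name and the log list are made up for illustration and only record the order in which the code runs.

```python
def traced_generator(log):
    # 'log' is a hypothetical list used only to record the order of execution.
    log.append("start")
    yield 5
    log.append("after 5")
    yield 8
    log.append("after 8")
    yield 12
    log.append("done")


log = []
gen = traced_generator(log)   # Nothing in the function body has run yet
first = next(gen)             # Runs until the first yield
second = next(gen)            # Resumes and runs until the second yield
print(first, second, log)     # 5 8 ['start', 'after 5']
```

Note that "after 8" is not in the log yet. That line only runs if you ask the generator for a third value.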

Now why is that powerful?

Let’s explore some use-cases.

#1 Pre-processing a work item

Say you have a pipeline of work items with a pre-processing step. Often you would combine the pre-processing with the actual processing. But it will make your code more readable and maintainable if you divide it up.

Explore the example.

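def pre_process_items():
    # Preparation step: yield a character-frequency dict for each row
    with open('data.txt') as f:
        for row in f:
            row = row.strip()
            freq = {c: row.count(c) for c in set(row)}
            yield freq

# Processing step: merge the per-row frequencies into one total
freq = {}
for item in pre_process_items():
    for k, v in item.items():
        freq[k] = freq.get(k, 0) + v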

In this case you prepare the work item in pre_process_items().

If you want to learn about Dict Comprehension, read this guide.

This way you divide your code into a piece that prepares data and another one where you process the data. This makes the code easier to understand.
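The same split can be tried without a data.txt file. The sketch below is an assumption-laden variation: it takes any iterable of lines, and it swaps the dict comprehension for collections.Counter from the standard library. The names pre_process_lines, total_frequencies, and sample are made up for illustration.

```python
from collections import Counter


def pre_process_lines(lines):
    # Preparation step: strip each line and count its characters.
    for line in lines:
        yield Counter(line.strip())


def total_frequencies(lines):
    # Processing step: merge the per-line counts into one total.
    total = Counter()
    for counts in pre_process_lines(lines):
        total.update(counts)
    return total


sample = ["abca\n", "bcd\n"]  # Hypothetical stand-in for the rows of data.txt
print(total_frequencies(sample))
```

Because the generator accepts any iterable, you can pass it an open file object just as well as a list, and the processing code does not need to change.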

#2 Filtering work items

Often you have a list of possible work items, but only a few of them actually need to be processed.

A simple example is processing a log file, where we are only interested in a specific log level.

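def get_warnings(log_file):
    # Yield only the rows that contain the WARNING level
    with open(log_file) as f:
        for row in f:
            if 'WARNING' in row:
                yield row

for warning in get_warnings('log_file.txt'):
    # Each row already ends with a newline, so don't add another one
    print(warning, end='')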

This example shows how a Generator keeps the filtering logic in one place, separate from the code that uses the filtered items.
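If you need more than one log level, the filter can take the level as a parameter. This is only a sketch: filter_level and the sample rows are hypothetical, and it works on any iterable of lines so it runs without a log file.

```python
def filter_level(rows, level):
    # Yield only the rows that mention the given log level.
    for row in rows:
        if level in row:
            yield row.rstrip("\n")


# Hypothetical log lines standing in for the contents of log_file.txt
rows = [
    "INFO service started\n",
    "WARNING disk almost full\n",
    "ERROR write failed\n",
    "WARNING retrying\n",
]

warnings = list(filter_level(rows, "WARNING"))
print(warnings)  # ['WARNING disk almost full', 'WARNING retrying']
```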

If you want to learn more about text processing in Python read this guide.

#3 API calls

A great use-case is making API calls. This might require setup, filtering the result, and possibly reformatting it.

import pandas_datareader as pdr
from datetime import datetime, timedelta

def get_stocks(tickers):
    d = datetime.now() - timedelta(days=7)
    for ticker in tickers:
        data = pdr.get_data_yahoo(ticker, d)
        close_price = list(data['Close'])
        yield close_price

for prices in get_stocks(['AAPL', 'TWTR']):
    print(prices)

The advantage of this is that the call to the API is only made when you need the data (lazy loading). Say you have a list of thousands of tickers: if you had to make all the calls before you could start processing, you could be waiting a long time.

With Generators you can utilize the power of lazy-loading.
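You can check the lazy behavior without touching a real API. In the sketch below, fake_fetch is a hypothetical stand-in for a slow API call that returns placeholder numbers, and the ticker symbols are made up. itertools.islice from the standard library takes only the first two results.

```python
from itertools import islice

calls = []  # Records which tickers were actually "fetched"


def fake_fetch(ticker):
    # Hypothetical stand-in for a slow API call; returns placeholder data.
    calls.append(ticker)
    return [1.0, 2.0, 3.0]


def get_prices(tickers):
    # The fetch for a ticker only happens when that item is requested.
    for ticker in tickers:
        yield ticker, fake_fetch(ticker)


tickers = ["AAA", "BBB", "CCC", "DDD"]  # Placeholder symbols
first_two = list(islice(get_prices(tickers), 2))
print(calls)  # ['AAA', 'BBB']
```

Only two of the four tickers were fetched, because the consumer stopped asking after two items.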

Want to learn more?

If this is something you like and you want to get started with Python, then check out my 8-hour FREE video course with full explanations, projects at each level, and guided solutions.

The course is structured with the following resources to improve your learning experience.

  • 17 video lessons teaching you everything you need to know to get started with Python.
  • 34 Jupyter Notebooks with lesson code and projects.
  • A FREE 70+ page eBook with all the learnings from the lessons.
Rune
