How to Fetch CNN Breaking Tweets and Make Simple Statistics Automated with Python

What will we cover

  • We will use the tweepy library
  • Read the newest tweets from CNN Breaking
  • Make simple word statistics on the news tweets
  • See if we can learn anything from it

Preliminaries

The Code that does the magic

import tweepy

# Personal details: insert your key, secret, token and token_secret here
consumer_key = ""
consumer_secret = ""
access_token = ""
access_token_secret = ""

# authentication of consumer key and secret
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)

# authentication of access token and secret
auth.set_access_token(access_token, access_token_secret)

# Creation of the actual interface, using authentication
api = tweepy.API(auth)

# Use a dictionary to count the appearances of words
stat = {}

# Read the tweets from @cnnbrk and make the statistics
for status in tweepy.Cursor(api.user_timeline, screen_name='@cnnbrk', tweet_mode="extended").items():
    for word in status.full_text.split():
        if word in stat:
            stat[word] += 1
        else:
            stat[word] = 1

# Let's just print the top 10
top = 10

# Let us sort them on the value in reverse order to get the highest first
for word in sorted(stat, key=stat.get, reverse=True):
    # leave out all the small words
    if len(word) > 6:
        print(word, stat[word])
        top -= 1
        if top < 0:
            break

The result of the above (done May 30th, 2020)

coronavirus 441
@CNNPolitics: 439
President 380
updates: 290
impeachment 148
officials 130
according 100
Trump's 98
Democratic 96
against 88
Department 83

Among the longer words, coronavirus is the most frequent one in the CNN Breaking tweets, so it is still the dominant breaking subject today.

Next steps

  • It should be extended to have a more intelligent interpretation of the data.
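
One possible first step in that direction is to normalise the words before counting them. The sketch below is an assumption about what that could look like, not part of the script above: it uses collections.Counter from the standard library, lowercases each word, strips surrounding punctuation, and skips mentions and links. The helper name count_words and the sample texts are hypothetical; in the real script the texts would come from status.full_text.

```python
from collections import Counter
import string


def count_words(texts, min_length=7):
    """Count normalised words across an iterable of tweet texts."""
    counter = Counter()
    for text in texts:
        for raw in text.split():
            # Skip mentions and links; they are not news vocabulary
            if raw.startswith("@") or raw.startswith("http"):
                continue
            # "updates:" and "Updates" should count as the same word
            word = raw.strip(string.punctuation).lower()
            if len(word) >= min_length:
                counter[word] += 1
    return counter


# Hypothetical sample texts standing in for status.full_text values
sample = [
    "Coronavirus updates: officials report new cases",
    "President comments on coronavirus, according to @CNNPolitics: https://example.com",
]
stats = count_words(sample)
print(stats.most_common(3))
```

With this kind of normalisation, entries such as "updates:" and "@CNNPolitics:" from the result above would either be merged with their plain form or dropped.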

Learn the Basics in PyCharm – How to Program as a Professional with Python

What is PyCharm?

PyCharm is an integrated development environment (IDE) used in computer programming, specifically for the Python language.

Learn more about it here. Where to download it?

Is it free? New to Python?

Get Started in PyCharm and Create Your First Program in less than 5 Minutes

How do you start in PyCharm? How do you create a project, and what is a project? See how to get from the first start to running your first program in PyCharm. Want to learn more about Python?

Learn the Basics in PyCharm Debugger in 6 Minutes

In this video we are going to learn the basics in the PyCharm Debugger.

There are a lot of nice things you can do, but you only need a small percentage of them in order to get started. Follow me in a simple walk-through of debugging a Python program.

Want to learn more about debugging? Debugging is one of those tasks you hate and love. You hate when your program doesn’t do as you expect. But you love when you figure out why.

A debugger helps you in getting from HATE to LOVE.

New to Python and Programming? Check out the online course below.

Check out my Beginners Level Course on Python

Automate Posting on Facebook in Python – Follow these 7 easy steps

Overview

After these steps you will be able to automate the process of posting on Facebook with a Python script. In this example I will show how it is done on a Facebook brand page, Learn Python With Rune.

What you need.

  • A graph API token, which you get by registering as a developer on Facebook and creating an App there.
  • A simple Python program using the facebook library

Step 1: Registering as developer at Facebook

To register as a developer at Facebook you need to log in to developers.facebook.com

You press Log In in the top right corner and log in with your Facebook credentials.

Step 2: Create an App

You need to create an App to get the graph API token.

Under My Apps you press Create App.

Press the Manage Pages, Ads or Groups.

You enter an App Display Name, which will be the name that is used when posting from this App. Hence, choose a name that you would like people to see in the post.

Fill out your email (probably it is automatically there) and press Create App ID.

Step 3: Create Graph API token

Under Tools choose Graph API Explorer.

Ensure that the right Facebook App is chosen. Then under User or Page choose Get Page Access Token.

It will prompt you to log in to your Facebook account and ask permission for sharing your page.

Agree with that.

Then you will get back to this screen, where you want to add the pages_manage_posts permission. It grants access to create posts.

Then click Generate Access Token and you will be prompted to agree with the new access rights on your Facebook page.

Step 4: Prolong your graph API token

The graph API token is quite short lived, so you want to extend it.

Press the info at the graph API token.

Then press the Open in Access Token Tool.

At the bottom you will find Extend Access Token. Press that.

Step 6: Install facebook-sdk library

To make your life easy in Python, you need to install the facebook-sdk library.

pip install facebook-sdk

Step 7: The Python magic

You need to insert your Access Token in the code.

Also, insert the page ID you want. You can find your Page ID with this page.

import facebook

page_access_token = ""  # Replace with your access token
graph = facebook.GraphAPI(page_access_token)
facebook_page_id = ""  # Insert your page ID here
graph.put_object(facebook_page_id, "feed", message='test message')

That’s it. Enjoy.

Learn How to Add Text to an Image in Python – 4 Easy Steps

Step 1: Install the Pillow library

We need the Pillow library in order to manipulate images. It can be installed by using pip.

pip install Pillow

This enables us to import the PIL (Python Imaging Library) package.

Step 2: Download a font to use.

You can browse fonts that are free to download from the Google Font Library. In this example I use Balsamiq Sans.

Place the downloaded font in the folder you want to work from.

Step 3: Find an awesome picture

I use Pexels to find free awesome pictures to use.

Place that image in the same location.

Step 4: Create your Python program to add your awesome text

from PIL import Image, ImageDraw, ImageFont

# Open the image (change name and location if needed)
image = Image.open('pics/background.jpg')
draw = ImageDraw.Draw(image)

# Load the font (change name and location if needed, also change font size if needed)
font = ImageFont.truetype('font/BalsamiqSans-Bold.ttf', size=200)
black = 'rgb(0, 0, 0)'  # black color

# Draw the text - change position if needed
message = "This is great!"
draw.text((1000, 200), message, fill=black, font=font)

# Draw the text - change position if needed
name = 'You are AWESOME!'
draw.text((1100, 500), name, fill=black, font=font)

image.save('greeting_card.png')

Bubble Sort Explained, Implemented in Python and Time Complexity Analysis

What will we cover?

  • What is Bubble sort and how does it work?
  • How to implement it in Python
  • How to evaluate the time complexity of the Bubble sort algorithm

What is Bubble sort and how does it work?

In this video the Bubble sort algorithm is explained. On a high level, it takes an unsorted list and returns it sorted.

How to implement Bubble sort in Python

Well, notice the above description. It is straightforward and simple to implement. That is one of the beauties of the algorithm: simple to understand and simple to implement.

def bubble_sort(my_list):
    for i in range(len(my_list), 1, -1):
        for j in range(1, i):
            if my_list[j-1] > my_list[j]:
                my_list[j-1], my_list[j] = my_list[j], my_list[j-1]

A simple illustration of how you can use it.

import random


def generate_random_list(n):
    my_list = []
    for i in range(n):
        my_list.append(random.randint(0, n))
    return my_list


def bubble_sort(my_list):
    for i in range(len(my_list), 1, -1):
        for j in range(1, i):
            if my_list[j-1] > my_list[j]:
                my_list[j-1], my_list[j] = my_list[j], my_list[j-1]



my_list = generate_random_list(20)
print(my_list)
bubble_sort(my_list)
print(my_list)

Evaluating the performance of Bubble sort

First of all, the algorithm has two for-loops: an outer and an inner for-loop. With an input list of length N, the loops will be iterated the following number of times.

  • Outer for-loop: N-1 times
  • Inner for-loop: N-1 + N-2 + N-3 + … + 1 times in total

Knowing the formula

  • N-1 + N-2 + … + 1 + 0 = (N-1)*N/2

Since (N-1)*N/2 grows as N^2, we can conclude that the time complexity of Bubble sort is O(n^2).
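
The formula can be checked by counting the comparisons directly. The sketch below is an illustration only: count_comparisons is a hypothetical helper that runs the same two loops as bubble_sort on a copy of the list and counts how often the inner loop body is executed.

```python
def count_comparisons(my_list):
    """Bubble sort a copy of my_list and return the number of comparisons."""
    data = list(my_list)
    comparisons = 0
    for i in range(len(data), 1, -1):
        for j in range(1, i):
            comparisons += 1
            if data[j-1] > data[j]:
                data[j-1], data[j] = data[j], data[j-1]
    return comparisons


# Compare the counted value with (N-1)*N/2 for a few list sizes
for n in (5, 10, 100):
    counted = count_comparisons(list(range(n, 0, -1)))
    print(n, counted, (n - 1) * n // 2)
```

The count depends only on the length of the list, not on its content, because this version of Bubble sort never stops early.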

But let’s try to experiment with real running data, to see if we can confirm that complexity.

To get run-times from the algorithm the cProfile library comes in handy. It is easy to use and gives good insights. A simple way to get run-times is to set it up like this.

import random
import cProfile


def generate_random_list(n):
    return [random.randint(0, 4*n) for i in range(n)]


def bubble_sort(my_list):
    for i in range(len(my_list), 1, -1):
        for j in range(1, i):
            if my_list[j-1] > my_list[j]:
                my_list[j-1], my_list[j] = my_list[j], my_list[j-1]


def profile_bubble_sort(n):
    my_list = generate_random_list(n)
    bubble_sort(my_list)


cProfile.run("profile_bubble_sort(10000)")

It will result in an output in the following manner.

         56372 function calls in 11.056 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   11.056   11.056 <string>:1(<module>)
        1    0.000    0.000   11.055   11.055 BubbleSortProfiling.py:16(profile_bubble_sort)
        1    0.000    0.000    0.034    0.034 BubbleSortProfiling.py:5(generate_random_list)
        1    0.008    0.008    0.034    0.034 BubbleSortProfiling.py:6(<listcomp>)
        1   11.021   11.021   11.021   11.021 BubbleSortProfiling.py:9(bubble_sort)
    10000    0.010    0.000    0.022    0.000 random.py:200(randrange)
    10000    0.005    0.000    0.027    0.000 random.py:244(randint)
    10000    0.007    0.000    0.012    0.000 random.py:250(_randbelow_with_getrandbits)
        1    0.000    0.000   11.056   11.056 {built-in method builtins.exec}
        1    0.000    0.000    0.000    0.000 {built-in method builtins.len}
    10000    0.001    0.000    0.001    0.000 {method 'bit_length' of 'int' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
    16364    0.003    0.000    0.003    0.000 {method 'getrandbits' of '_random.Random' objects}

You get the time spent in Bubble sort by looking at the line for bubble_sort. The cumtime column gives the total time spent in the function, here 11.021 seconds.

By collecting the run-time for various sizes of lists, we get the following graph.

The graph shows O(n^2) growth, as expected.
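
Collecting the run-times for several sizes can be scripted as well. The following is a minimal sketch, assuming time.perf_counter is precise enough for this purpose. The helper name time_bubble_sort and the chosen sizes are assumptions for illustration, and the measured values will differ from machine to machine.

```python
import random
import time


def bubble_sort(my_list):
    for i in range(len(my_list), 1, -1):
        for j in range(1, i):
            if my_list[j-1] > my_list[j]:
                my_list[j-1], my_list[j] = my_list[j], my_list[j-1]


def time_bubble_sort(n):
    """Return the seconds spent sorting a random list of length n."""
    my_list = [random.randint(0, 4 * n) for _ in range(n)]
    start = time.perf_counter()
    bubble_sort(my_list)
    return time.perf_counter() - start


# Each size is double the previous one
sizes = [250, 500, 1000, 2000]
timings = [(n, time_bubble_sort(n)) for n in sizes]
for n, seconds in timings:
    print(n, round(seconds, 4))
```

If the growth is quadratic, doubling the size should roughly quadruple the run-time, although single measurements are noisy.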

How to Implement a Stack in Python and Check the Run-time Performance

We will cover the following in this article

  • What is a Stack – a short introduction
  • How to implement a Stack in Python
  • Investigate the run-time performance

What is a Stack

A Stack is a useful concept that also shows up in daily life, and hence a concept that is important to understand and master as a programmer.

To understand Stacks just think of a stack of plates. There are two main operations you can do. First, you can add a plate on top of the stack. That operation is called push, and it adds the element on top of the stack. Second, you can remove the top plate of the stack. That operation is called pop, and it returns the top element of the stack.

In the diagram below a Stack is pictured. It consists of a Stack of elements on the left side. The operation push of the element 03 is executed and the result is pictured on the right side. Notice that the push operation puts the element on top of the stack.

Below the operation pop is executed. Notice that the pop operation takes from the top of the stack.

Implementation of a Stack in Python

It is a good idea to have a helper class Node that represents the elements on the stack.

class Node:
    def __init__(self, element=None, next_node=None):
        self.element = element
        self.next_node = next_node

The actual functionality of the Stack is kept in a Stack class.

class Stack:
    def __init__(self):
        self.stack = None

    def push(self, element):
        self.stack = Node(element, self.stack)

    def pop(self):
        element = self.stack.element
        self.stack = self.stack.next_node
        return element

    def is_empty(self):
        return self.stack is None

Now you can use your stack, as in the example below.

s = Stack()
for i in range(5):
    s.push(i)
while not s.is_empty():
    print(s.pop())

This will give the following output.

4
3
2
1
0

Notice the order of the elements being removed from the stack by pop: the element pushed last is returned first.
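
One detail of the Stack class above is that calling pop on an empty stack fails with an AttributeError, because self.stack is None. The variant below is a sketch of one way to handle that, not the implementation from this article: the class name SafeStack, the peek method and the choice of IndexError are assumptions made for illustration.

```python
class Node:
    def __init__(self, element=None, next_node=None):
        self.element = element
        self.next_node = next_node


class SafeStack:
    def __init__(self):
        self.stack = None

    def push(self, element):
        self.stack = Node(element, self.stack)

    def pop(self):
        # Fail with a clear error instead of an AttributeError on None
        if self.stack is None:
            raise IndexError("pop from empty stack")
        element = self.stack.element
        self.stack = self.stack.next_node
        return element

    def peek(self):
        # Look at the top element without removing it
        if self.stack is None:
            raise IndexError("peek at empty stack")
        return self.stack.element

    def is_empty(self):
        return self.stack is None


s = SafeStack()
s.push(1)
s.push(2)
print(s.peek())
print(s.pop())
print(s.pop())
```

IndexError is chosen here because a Python list raises the same exception when pop is called on an empty list.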

Run-time Performance

Let us look at how the stack performs as the data size grows. To investigate the run-time performance, the cProfile library is a good choice and simple to use. The following piece of code will help you investigate the performance.

import cProfile

def profile_stack(n):
    s = Stack()
    for i in range(n):
        s.push(i)
    while not s.is_empty():
        s.pop()


cProfile.run("profile_stack(100000)")

See the following graph.

As you see, the push and pop operations take constant time, O(1), resulting in linear performance for n push and pop operations as in the above experiment.

Comparing Performance of Python list as a Stack – How a wrong implementation can ruin performance

A Stack?

A Stack uses the first-in-last-out principle.

It is like a stack of plates. The last one you put on the top is the first one you take.

How can you implement one in Python? Well, we are in luck: you can use a Python list, and if done correctly, you will get the same performance as an actual Stack implementation.

But first, how can you do it wrong?

Well, you might think that the first element of the list is the top of your stack. Hence, you will insert the elements at the first position and remove them from the first position as well.

# Create a list as a stack
s = []

# Insert into the first position.
element = 7
s.insert(0, element)

# Remove from the first position.
s.pop(0)

Sounds about right?

Let’s test that and compare it with a different approach: add the newest element to the end of the list and, hence, remove elements from the end of the list.

# Create a list and use it as stack
s = []

# Insert element in last position
element = 7
s.append(element)

# Remove from the last position
s.pop()

Let’s check the performance of those two approaches.

Comparing the performance of the two approaches

How do you compare them? You can use the cProfile library. It is easy to use and gives informative results.

See the sample code below, which compares the two approaches by creating a stack for each, inserting n elements into it and removing them afterwards.

import cProfile


def profile_list_as_queue_wrong(n):
    s = []
    for i in range(n):
        s.insert(0, i)
    while len(s) > 0:
        s.pop(0)


def profile_list_as_queue_correct(n):
    s = []
    for i in range(n):
        s.append(i)
    while len(s) > 0:
        s.pop()


def profile(n):
    profile_list_as_queue_wrong(n)
    profile_list_as_queue_correct(n)


cProfile.run("profile(100000)")

The results are given here.

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    5.842    5.842 <string>:1(<module>)
        1    0.078    0.078    0.107    0.107 Stack.py:12(profile_list_as_queue_correct)
        1    0.000    0.000    5.842    5.842 Stack.py:20(profile)
        1    0.225    0.225    5.735    5.735 Stack.py:4(profile_list_as_queue_wrong)
   200002    0.017    0.000    0.017    0.000 {len}
   100000    0.007    0.000    0.007    0.000 {method 'append' of 'list' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   100000    3.547    0.000    3.547    0.000 {method 'insert' of 'list' objects}
   200000    1.954    0.000    1.954    0.000 {method 'pop' of 'list' objects}
        2    0.014    0.007    0.014    0.007 {range}

Observe that the “wrong” implementation takes over 5 seconds and the “correct” one takes approximately 0.1 seconds. That is a difference of more than a factor of 50.

Looking into the details

If we look at the complexities given by Python, it explains it all.

The amortised complexities of the Python list operations are given on this page.

Notice that append and pop (last element) are O(1), which means constant time. Constant time means that the operations are independent of the size of the list. Hence, the correct implementation gives O(n) time complexity for n insertions and removals.

On the other hand, insert at the first position and pop(0) have linear performance, because every other element in the list has to be shifted. With n such operations, the wrong implementation ends up with O(n^2) time complexity.
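
To see the difference in growth rather than a single measurement, the two approaches can be timed for a few sizes. This is only a sketch using timeit from the standard library. The function names and the sizes are assumptions for illustration, and the printed times depend on the machine.

```python
import timeit


def fill_and_drain_front(n):
    """Use the first position of the list as the top of the stack."""
    s = []
    for i in range(n):
        s.insert(0, i)
    while s:
        s.pop(0)
    return s


def fill_and_drain_end(n):
    """Use the last position of the list as the top of the stack."""
    s = []
    for i in range(n):
        s.append(i)
    while s:
        s.pop()
    return s


# Each size is double the previous one
for n in (5000, 10000, 20000):
    front = timeit.timeit(lambda: fill_and_drain_front(n), number=1)
    end = timeit.timeit(lambda: fill_and_drain_end(n), number=1)
    print(n, round(front, 4), round(end, 4))
```

With quadratic growth the first column of times should grow by roughly a factor of four per doubling, while the second column should roughly double.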

How to Implement a Queue in Python and Compare Performance with a Python list

What will we cover in this article

  • What is a Queue?
  • Implement a Queue in Python
  • Make performance testing of it
  • Compare it with performance of a Python list

What is a Queue?

We all know what a queue is. You go to the grocery store and get spinach, strawberry and bananas for your shake. Then you see a long line of people in front of the register. That line is a queue.

The same holds in programming. You create queues to process data or input of any kind.

How to implement a Queue in Python

It is easier than you think.

First you create a Node class to represent each node in a queue. A node is an abstraction that holds the actual element and a pointer to the next node.

class Node:
    def __init__(self, element=None, next_node=None):
        self.element = element
        self.next_node = next_node


Then you create the class for the Queue.

class Queue:
    def __init__(self):
        self.head = None
        self.tail = None

    def enqueue(self, element):
        if self.head is None:
            self.head = self.tail = Node(element)
        else:
            # The new node becomes the last one, so it has no next node
            n = Node(element)
            self.tail.next_node = n
            self.tail = n

    def dequeue(self):
        element = self.head.element
        if self.tail is self.head:
            self.tail = self.head = None
        else:
            self.head = self.head.next_node
        return element

    def is_empty(self):
        return self.head is None

How does it work? Let’s make a simple example.

q = Queue()
for i in range(10):
    q.enqueue(i)

while not q.is_empty():
    print(q.dequeue())

Which will output.

0
1
2
3
4
5
6
7
8
9

Yes! You guessed it.

How do we test performance?

I like to use the cProfile library. It is easy to use and gives informative results.

So how do you test performance? You simply import the cProfile library and call cProfile.run(…).

You also need to do some operations to see how your Queue performs. See the code as an example.

import cProfile


def profile_queue(n):
    q = Queue()
    for i in range(n):
        q.enqueue(i)
    while not q.is_empty():
        q.dequeue()


def profile(n):
    profile_queue(n)


cProfile.run("profile(100000)")

Which will result in the following output.

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.819    0.819 <string>:1(<module>)
   100000    0.310    0.000    0.351    0.000 Queue.py:11(enqueue)
   100000    0.308    0.000    0.308    0.000 Queue.py:19(dequeue)
   100000    0.041    0.000    0.041    0.000 Queue.py:2(__init__)
   100001    0.021    0.000    0.021    0.000 Queue.py:27(is_empty)
        1    0.132    0.132    0.819    0.819 Queue.py:34(profile_queue)
        1    0.000    0.000    0.819    0.819 Queue.py:42(profile)
        1    0.000    0.000    0.000    0.000 Queue.py:7(__init__)
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.008    0.008    0.008    0.008 {range}

The interesting line is line 9 (profile_queue), which tells us how much time is spent in the call to profile_queue: 0.819 seconds in the cumtime column.

But is the result good?

We need to compare it to other implementations.

Performance testing the Queue with a Python list

Python lists are used for anything. Can we use a Python list as a Queue? Of course. Let’s try to implement that and compare it to our Queue.

import cProfile


def profile_queue(n):
    q = Queue()
    for i in range(n):
        q.enqueue(i)
    while not q.is_empty():
        q.dequeue()


def profile_list_as_queue(n):
    q = []
    for i in range(n):
        q.insert(0,i)
    while len(q) > 0:
        q.pop()


def profile(n):
    profile_queue(n)
    profile_list_as_queue(n)


cProfile.run("profile(100000)")

How does that compare? Let’s see.

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    3.680    3.680 <string>:1(<module>)
   100000    0.295    0.000    0.331    0.000 Queue.py:11(enqueue)
   100000    0.298    0.000    0.298    0.000 Queue.py:19(dequeue)
   100000    0.036    0.000    0.036    0.000 Queue.py:2(__init__)
   100001    0.019    0.000    0.019    0.000 Queue.py:27(is_empty)
        1    0.104    0.104    0.756    0.756 Queue.py:34(profile_queue)
        1    0.101    0.101    2.924    2.924 Queue.py:42(profile_list_as_queue)
        1    0.000    0.000    3.680    3.680 Queue.py:50(profile)
        1    0.000    0.000    0.000    0.000 Queue.py:7(__init__)
   100001    0.005    0.000    0.005    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
   100000    2.805    0.000    2.805    0.000 {method 'insert' of 'list' objects}
   100000    0.012    0.000    0.012    0.000 {method 'pop' of 'list' objects}
        2    0.004    0.002    0.004    0.002 {range}

Wow. Our Queue (0.756 seconds) is way faster than the Python list used as a Queue (2.924 seconds).

But how does it compare in general?

Comparing the performance of the Queue and a Python list as a Queue.

While it is difficult to see, the performance of the Queue is O(n) (linear) while the performance of the Python list as a Queue is O(n^2).

Hence, the Queue will outperform the Python list for this use case.
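
As a side note, the Python standard library also has collections.deque, which is documented to support appends and pops from both ends in approximately O(1) time. The sketch below only shows how it can be used as a queue. It was not part of the measurements above, so no claim is made here about how its timing compares to the Queue class.

```python
from collections import deque

q = deque()
for i in range(5):
    q.append(i)  # enqueue at the right end

order = []
while q:
    order.append(q.popleft())  # dequeue from the left end

print(order)
```

The elements come out in the same order they were added, which is the first-in-first-out behaviour of a queue.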