    NumPy: How does Sexual Compulsivity Scale Correlate with Men, Women, or Age?

    Background

    According to Wikipedia, the Sexual Compulsivity Scale (SCS) is a psychometric measure of high libido, hypersexuality, and sexual addiction. The article does not explain how to interpret the score itself, but the score is based on people rating 10 questions from 1 to 4.

    The questions are the following.

    Q1. My sexual appetite has gotten in the way of my relationships.
    Q2. My sexual thoughts and behaviors are causing problems in my life.
    Q3. My desires to have sex have disrupted my daily life.
    Q4. I sometimes fail to meet my commitments and responsibilities because of my sexual behaviors.
    Q5. I sometimes get so horny I could lose control.
    Q6. I find myself thinking about sex while at work.
    Q7. I feel that sexual thoughts and feelings are stronger than I am.
    Q8. I have to struggle to control my sexual thoughts and behavior.
    Q9. I think about sex more than I would like to.
    Q10. It has been difficult for me to find sex partners who desire having sex as much as I want to.
    

    The questions are rated as follows (1=Not at all like me, 2=Slightly like me, 3=Mainly like me, 4=Very much like me).

    A dataset of more than 3,300 responses is available. It includes each respondent's rating of every question, the total score (the sum of the ratings), age, and gender.
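
    To make the scoring concrete, here is a minimal sketch of how a total score follows from the ten ratings. The ratings below are made up for illustration and are not taken from the dataset.

```python
import numpy as np

# Hypothetical ratings for three respondents (rows) on Q1..Q10 (columns).
ratings = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],   # lowest possible answers
    [2, 3, 2, 1, 2, 4, 2, 2, 3, 2],
    [4, 4, 4, 4, 4, 4, 4, 4, 4, 4],   # highest possible answers
])

# Total SCS score per respondent: sum across the question columns.
total_score = ratings.sum(axis=1)
print(total_score)  # [10 23 40]
```

    With 10 questions rated from 1 to 4, the total score always lies between 10 and 40.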

    Step 1: First inspection of the data.

    Inspection of the data (CSV file)

    The first question that pops into my mind is how men and women rate themselves differently. How can we efficiently figure that out?

    Welcome to NumPy. Its genfromtxt function is a built-in CSV reader that does all the hard work.

    import numpy as np
    data = np.genfromtxt('scs.csv', delimiter=',', dtype='int')
    # Skip first row as it has description
    data = data[1:]
    # Column 11 holds gender: 1 = men, 2 = women
    men = data[data[:,11] == 1]
    women = data[data[:,11] == 2]
    # Column-wise averages: Q1..Q10, total score, gender, age
    print("Men average", men.mean(axis=0))
    print("Women average", women.mean(axis=0))
    

    Splitting the data into men and women is easy with NumPy, because a vectorized comparison produces a boolean mask that selects the matching rows. Men are coded as 1 and women as 2 in column 11 (the 12th column). Finally, a call to mean with axis=0 computes the average of each column.

    Men average [ 2.30544662  2.2453159   2.23485839  1.92636166  2.17124183  3.06448802
      2.19346405  2.28496732  2.43660131  2.54204793 23.40479303  1.
     32.54074074]
    Women average [ 2.30959164  2.18993352  2.19088319  1.95916429  2.38746439  3.13010446
      2.18518519  2.2991453   2.4985755   2.43969611 23.58974359  2.
     27.52611586]
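
    The row selection used above can be tried on a tiny array. This is only a sketch: the rows and the three-column layout (score, gender, age) are hypothetical and simpler than the real file, where gender sits in column 11.

```python
import numpy as np

# Hypothetical rows: [total score, gender code, age]; 1 = men, 2 = women.
sample = np.array([
    [20, 1, 30],
    [26, 2, 25],
    [24, 1, 40],
    [30, 2, 35],
])

is_man = sample[:, 1] == 1         # boolean array, one entry per row
men = sample[is_man]               # keeps only the rows where the mask is True
women = sample[sample[:, 1] == 2]  # same idea, written in one step

men_mean = men.mean(axis=0)        # column-wise mean over the selected rows
women_mean = women.mean(axis=0)
print(is_man)      # [ True False  True False]
print(men_mean)    # [22.  1. 35.]
print(women_mean)  # [28.  2. 30.]
```

    The gender column averages to exactly 1 or 2 in each group, which is a quick check that the mask selected the intended rows. The same pattern shows up in the real output above.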
    

    Interestingly, in this dataset women score slightly higher on the SCS than men (an average of 23.59 versus 23.40). Treat the result with some caution, as 21% of the answers were not used.

    Men rate highest on the following question:

    Q6. I find myself thinking about sex while at work.
    

    Women rate highest on this question:

    Q6. I find myself thinking about sex while at work.
    

    The same one. The lowest-rated question is also the same for both genders.

    Q4. I sometimes fail to meet my commitments and responsibilities because of my sexual behaviors.
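
    Rather than reading the highest and lowest averages off the printed arrays by eye, np.argmax and np.argmin can find them. The sketch below uses the per-question averages from the output above, rounded to three decimals. The variable names are made up for illustration.

```python
import numpy as np

# Per-question averages for Q1..Q10, rounded from the printed output.
men_avg = np.array([2.305, 2.245, 2.235, 1.926, 2.171,
                    3.064, 2.193, 2.285, 2.437, 2.542])
women_avg = np.array([2.310, 2.190, 2.191, 1.959, 2.387,
                      3.130, 2.185, 2.299, 2.499, 2.440])

for label, avg in (("Men", men_avg), ("Women", women_avg)):
    # +1 turns a 0-based array index into a question number.
    highest = int(np.argmax(avg)) + 1
    lowest = int(np.argmin(avg)) + 1
    print(f"{label}: highest Q{highest}, lowest Q{lowest}")
```

    On the full data, the same calls would be applied to the first ten entries of each mean array, for example men.mean(axis=0)[:10].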
    

    Step 2: Visualize age vs score

    I would guess that the SCS score decreases with age. Let’s see if that is the case.

    Again, NumPy makes it easy to prepare the data. To visualize it we use Matplotlib, which is a comprehensive library for creating static, animated, and interactive visualizations in Python.

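    import numpy as np
    import matplotlib.pyplot as plt
    data = np.genfromtxt('scs.csv', delimiter=',', dtype='int')
    # Skip first row as it has description
    data = data[1:]
    score = data[:,10]  # total SCS score
    age = data[:,12]
    # Set implausible ages (above 100) to 0; those rows stay in the plot at age 0
    age[age > 100] = 0
    # Low alpha makes overlapping points show up as darker areas
    plt.scatter(age, score, alpha=0.05)
    plt.show()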
    

    Resulting in this plot.

    Age vs SCS score.

    The plot does not actually show any obvious correlation. Keep in mind that more young people responded to the survey.
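
    A side note on the preparation step: the code above replaces ages over 100 with 0, so those respondents stay in the data as points at age 0. One possible alternative, which is not what the code above does, is to drop those rows with a boolean mask. The values below are hypothetical.

```python
import numpy as np

# Hypothetical ages and scores; 999 stands in for an implausible age entry.
age = np.array([22, 35, 999, 41, 19])
score = np.array([25, 18, 30, 22, 27])

plausible = age <= 100          # True where the age looks usable
age_clean = age[plausible]      # boolean indexing returns copies,
score_clean = score[plausible]  # so the original arrays are left untouched
print(age_clean, score_clean)   # [22 35 41 19] [25 18 22 27]
```

    Both arrays have to be filtered with the same mask so that each age still lines up with its score.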

    Let’s ask NumPy what it thinks about the correlation here. Luckily, we can do that by calling the corrcoef function, which calculates the Pearson product-moment correlation coefficients.

    print("Correlation of age and SCS score:", np.corrcoef(age, score))
    

    Resulting in this output.

    Correlation of age and SCS score:
    [[1.         0.01046882]
     [0.01046882 1.        ]]
    

    This indicates no meaningful correlation: values between 0.0 and 0.3 are usually considered a small correlation, and 0.01046882 is close to none. Does that mean that the SCS score does not correlate with age? That our SCS score is static through life?

    I do not think we can conclude that based on this small dataset.
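
    For readers wondering why corrcoef prints a 2x2 matrix, the sketch below pulls out the single coefficient and compares it with Pearson's formula computed by hand. The data is hypothetical and only shows the mechanics.

```python
import numpy as np

# Hypothetical data, only to show the mechanics.
age = np.array([20, 25, 30, 35, 40, 45])
score = np.array([22, 27, 19, 30, 21, 25])

matrix = np.corrcoef(age, score)  # 2x2 symmetric matrix with ones on the diagonal
r = matrix[0, 1]                  # correlation between age and score

# Pearson's r by hand: covariance divided by the product of the
# standard deviations (the 1/n factors cancel out).
age_dev = age - age.mean()
score_dev = score - score.mean()
r_manual = (age_dev * score_dev).sum() / np.sqrt(
    (age_dev ** 2).sum() * (score_dev ** 2).sum()
)

print(r, r_manual)
```

    The diagonal entries are the correlation of each variable with itself, which is why they are 1. The two off-diagonal entries are identical, so either one can be read as the result.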

    Step 3: Bar plot the distribution of scores

    The scatter plot also suggested that the scores are close to evenly distributed.

    Let’s check that. Here we need to count the participants for each score. NumPy can do this directly (for example with np.unique or np.bincount), but a plain Python loop with lists keeps every step explicit.

    import numpy as np
    import matplotlib.pyplot as plt
    data = np.genfromtxt('scs.csv', delimiter=',', dtype='int')
    # Skip first row as it has description
    data = data[1:]
    # Possible total scores: 10 questions rated 1 to 4 gives 10 to 40
    score_values = []
    counts = []
    for i in range(10, 41):
        score_values.append(i)
        # Number of participants whose total score (column 10) equals i
        counts.append(data[data[:, 10] == i].shape[0])
    plt.bar(score_values, counts)
    plt.show()
    

    Resulting in this bar plot.

    Count participants by score.

    We knew that the average score was around 23, which is close to the midpoint of the 10 to 40 range and could be consistent with a roughly even distribution. The counts do seem to drop off a little at the far high end of the SCS score, though.
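
    As a sketch of an alternative to the loop above, np.bincount can produce the same kind of per-score counts in one call. The scores here are hypothetical. On the real data, the input would be column 10.

```python
import numpy as np

# Hypothetical total scores for a handful of participants.
total_scores = np.array([10, 23, 23, 40, 17, 23, 17])

# bincount counts occurrences of each integer from 0 up to the maximum;
# minlength=41 guarantees that index 40 exists even if nobody scored 40.
counts = np.bincount(total_scores, minlength=41)[10:41]
score_values = np.arange(10, 41)

print(counts[score_values == 23])  # [3]
```

    The resulting score_values and counts arrays can be passed to plt.bar in the same way as the two lists built by the loop.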

