SmartData Collective
© 2008-23 SmartData Collective. All Rights Reserved.

Python and Productivity

Editor SDC
Last updated: 2009/04/22 at 11:26 AM

One of the main benefits of programmability is the ability to extend and automate SPSS Statistics capabilities. I’d like to tell you the story of a recent extension effort: the SPSSINC TURF command and dialog.

TURF stands for Total Unduplicated Reach and Frequency. It is a common technique in market research. Suppose you have a survey about sports viewing popularity. It asks about football, soccer, baseball, basketball, hockey, and other sports. You would like to know how to reach the most viewers with no more than three sports.

You could tabulate the positive responses to each sport with FREQUENCIES. But that doesn’t answer the question, because the audiences overlap. What you want is the highest reach among combinations of up to three sports, with the overlap eliminated.
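To make the overlap problem concrete, here is a tiny illustration in Python. The respondent numbers are made up for the example; they are not from any real survey.

```python
# Hypothetical respondents (case numbers) who watch each sport.
football = {1, 2, 3, 4}
soccer = {3, 4, 5}

# Adding the two frequency counts double-counts cases 3 and 4.
summed_frequencies = len(football) + len(soccer)

# The set union counts each respondent once: the unduplicated reach.
unduplicated_reach = len(football | soccer)

print(summed_frequencies, unduplicated_reach)  # prints "7 5"
```

The gap between the two numbers is exactly the size of the shared audience.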

Calculating the TURF requires taking, for every combination of sports up to a certain size, the set union of the positive responses, and then presenting the best of those combinations. The task is conceptually simple but computationally demanding: the number of combinations grows explosively as the number of questions increases.
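The calculation just described can be sketched directly with the Python standard library. This is a brute-force illustration, not the SPSSINC TURF code; the function names and the sample data are hypothetical.

```python
from itertools import combinations
from math import comb


def turf_brute_force(audiences, max_size):
    """Return (reach, combination) pairs for every combination of up to
    max_size questions, sorted from highest reach to lowest."""
    results = []
    for size in range(1, max_size + 1):
        for combo in combinations(sorted(audiences), size):
            # Union of the respondents reached by any question in the combo.
            reached = set().union(*(audiences[name] for name in combo))
            results.append((len(reached), combo))
    results.sort(key=lambda pair: (-pair[0], pair[1]))
    return results


def combination_count(n_questions, max_size):
    """Number of combinations of size 1..max_size that must be examined."""
    return sum(comb(n_questions, k) for k in range(1, max_size + 1))


# Hypothetical data: question name -> case numbers with positive responses.
audiences = {
    "football": {1, 2, 3, 4},
    "soccer": {3, 4, 5, 6},
    "baseball": {1},
    "hockey": {7},
}
best = turf_brute_force(audiences, max_size=2)[0]
print(best)                      # (6, ('football', 'soccer'))
print(combination_count(6, 3))   # 41
print(combination_count(30, 3))  # 4525
```

Six questions with up to three per combination give only 41 unions to compute, but thirty questions already give 4,525, and raising the maximum size makes the count climb much faster still.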

SPSS Statistics does not have a built-in way to do this, so I set out to create an extension command for it, implemented in Python: SPSSINC TURF. First, I decided to work with transposed data and the built-in set algebra capabilities of Python. I make a pass over the question data and create a set for each question listing the case numbers that have positive responses. That’s just a few lines of code.
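A minimal sketch of that transposition step follows. It assumes the cases are already available as plain Python rows and that 1 codes a positive response; reading the cases from SPSS Statistics itself is not shown, and the helper name build_response_sets is hypothetical.

```python
def build_response_sets(question_names, cases):
    """Turn case-by-question rows into one set of case numbers per question."""
    response_sets = {name: set() for name in question_names}
    for case_number, row in enumerate(cases, start=1):
        for name, answer in zip(question_names, row):
            if answer == 1:  # assumption: 1 codes a positive response
                response_sets[name].add(case_number)
    return response_sets


questions = ["football", "soccer", "baseball"]
cases = [
    (1, 0, 1),  # case 1
    (1, 1, 0),  # case 2
    (0, 1, 0),  # case 3
    (0, 0, 0),  # case 4
]
sets_by_question = build_response_sets(questions, cases)
print(sets_by_question)
```

Once the data is in this shape, the reach of any combination is just the size of a set union.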

The trickier part was figuring out how to manage all the set union calculations. It’s a set of tree structures for which a little bit of recursion boils the work down to a few lines of code. My first try was getting clumsy, so I went out for a bike ride for a few hours and came back with the algorithm worked out in my head. I believe in the left-brain, right-brain approach: study something intensively; then relax or do something different, and things are much clearer when you return to the subject.
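One way such a recursion might look is sketched below. It is an assumption about the general shape, not the actual SPSSINC TURF algorithm: each call extends the current combination by one question and passes the running union down the tree, so no union is rebuilt from scratch. All names and data are hypothetical.

```python
def best_combinations(names, sets_by_name, max_size):
    """Return {size: (reach, combination)} for sizes 1..max_size."""
    best = {}

    def extend(start, combo, reached):
        # Each call is one node of the tree; `reached` is the union so far.
        for i in range(start, len(names)):
            new_combo = combo + (names[i],)
            new_reached = reached | sets_by_name[names[i]]
            size = len(new_combo)
            # Keep the first combination found with the highest reach.
            if size not in best or len(new_reached) > best[size][0]:
                best[size] = (len(new_reached), new_combo)
            if size < max_size:
                extend(i + 1, new_combo, new_reached)

    extend(0, (), frozenset())
    return best


# Hypothetical data: question name -> case numbers with positive responses.
names = ["football", "soccer", "baseball", "hockey"]
sets_by_name = {
    "football": {1, 2, 3, 4},
    "soccer": {3, 4, 5, 6},
    "baseball": {1},
    "hockey": {7},
}
best = best_combinations(names, sets_by_name, max_size=3)
print(best[3])  # (7, ('football', 'soccer', 'hockey'))
```

Only starting each level at index i + 1 keeps the traversal from visiting the same combination twice in a different order.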

Putting this together, I finished the code, but I was worried that this task would be so computationally demanding that it would be too slow to be useful. As it turned out, though, the approach I took, heavily leveraging Python sets and some other features, runs amazingly fast. And although the sets have to fit in memory, it seems to handle pretty large problems.

I went on to create a dialog box interface using the Version 17 Custom Dialog Builder (CDB). I also defined the extension command syntax using the extension mechanism, which requires a small XML file to define the syntax and uses our extension.py module to handle that interface.

So, what sort of effort did this take? Less than one day, including the bike ride. How much more productive could you be? Combining Python and SPSS Statistics with the CDB and other tools reduced this task to about 225 lines of code plus the dialog and XML.

I posted this to SPSS Developer Central, where it can be downloaded for free. It is written for SPSS Statistics 17, but it will work with version 16 (not including the dialog) with a small change documented in the readme file. One competing product that does this as a main feature sells for a four-figure price.

The original version posted had a subset of the features I had thought about doing. I wanted to see what interest there might be. Within a few days I had received and implemented a few enhancement requests. By getting the first version out to the world, it was easier to see what additional features users might want. Again, higher productivity by not implementing things that would probably not be used. But maybe I’ll do more later.

This is typical of many programmability projects, in my experience: big results for small amounts of work. Of course, I’ve done this a lot, so I know all the tools and how to approach a problem. Programmability definitely requires an investment in learning the technology, but it’s hard to beat the ROI.
