Stata on a Mac: Step 4: Importing unformatted data

If you're lucky, you'll never have to deal with unformatted data. You'll just cruise to ICPSR or the local library and get your data in pre-formatted, machine-readable files with no errors, complete dictionary and label files, and lots of documentation.

That's exactly what I did for four years. Then I ran across an old, useful piece of data that had never before been put into a statistics package.

That's an exciting moment. While the General Social Survey has been combed over (and over and over) by social scientists, including me, it's nice to find a new dataset that hasn't been mined very heavily. Either that means it's a dud, or just that no one bothered with it and you may find something new.

Here's what the data looked like when I opened it:

0001124352540000000000000000000000000000000000000
000222223225202022222322120200001222232322520202223232212020
371141530500000000000000140010021803

Hmm.

No commas, no spaces, and no file to read all of this into Stata (or SPSS or anything else).

However, there was a paper codebook that explained what all of this meant. The first four digits are the ID; the next is the card number; the next is the opinions of Fraternal Order of Police members on the severity of illegal bookmaking on a 1-to-4 Likert scale. And so on.

infix dictionary {
* imports Fraternal Order of Police Study from 1975 Gambling Commission Study
2 lines
1:
ID 1-4

card 5

serbook 7

serlarc 8

sernumb 9

serpot 10

sercard 11

serfence 12

serhook 13

And we're off! By the way, "2 lines" is a Stata command indicating that a data records runs longer than a single line.

From here there are two ways to go:

First way:

Save the file as a .dct file, meaning it's a data dictionary

Use the infix menu in Stata (under File)

Specify the location of the raw data file and the dictionary file:

Stata File Menu

Picture 9.jpg

Second way (preferred for me):

Paste the raw data into the file below the dictionary

Save the file as a .do file

Run it in Stata

At this point, everything should be fine. Of course, you'll make mistakes the first few times, and need to re-run things.

Next time I'll address Step 5: Logging your sessions.

Stata on a Mac

Wednesday, November 28, 2007

Step 4: Importing unformatted data

No comments:

Subscribe Now

MacStata Online Resources

Mac Freeware Links

Blog Archive