Sas Read in Multiple Text Files Name Not in


How to Import Text Files into SAS

Text files are a common file format to use when importing or exporting data from one data source to another. When importing text files from other data sources or databases, there are many variations in the data structure and delimiters that you can come across.

This article aims to address some of the more common challenges that arise when attempting to import different variations of text files into SAS. A few different tips and methods are also provided along the way.

 Topics covered include:

  1. Importing tab-delimited text files with PROC IMPORT
  2. Importing special character delimited text files with PROC IMPORT
  3. Importing space-delimited text files with PROC IMPORT
  4. Importing a tab-delimited text file using Data Step
  5. Using PROC IMPORT to generate Data Step code for importing text files

Software

Before we continue, make sure you have access to SAS Studio. It's free!

Data Sets

The examples used in this article are based on the text files listed below. The text files are derived from SASHELP datasets, including the CARS and ORSALES datasets:

  1. Cars_tab.txt
  2. Cars_pipe.txt
  3. Orsales_space.txt

Before running any of the examples below, you will need to replace the path '/home/your_username/SASCrunch' with a directory that you have read/write access to in your environment.

This can vary depending on the machine you are running SAS on, the version of SAS you are running and the operating system (OS) you are using.

If you are using SAS OnDemand for Academics, you must first upload the files to the SAS server.

Please visit here for instructions if you are not sure how to do this.
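Because every example in this article repeats the same directory, one optional convenience is to store it once in a macro variable. The sketch below is not part of the original examples: the macro variable name path is hypothetical, and the path must be in double quotes, because macro variables are not resolved inside single quotes. The PROC IMPORT options used here are explained in section 1.

```sas
/* Hypothetical macro variable; set it once to your own directory */
%let path = /home/your_username/SASCrunch;

/* Double quotes are required so that &path. is resolved */
proc import datafile = "&path./cars_tab.txt"
    out = cars_tab
    dbms = dlm
    replace;
    delimiter = '09'x;
    datarow = 3;
run;
```

With this approach, only the %let line needs to change when the files move to a different directory.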

1. Importing a Tab-delimited Text File with PROC IMPORT

With a tab-delimited text file, the variables (columns) are separated by a tab and the files typically end with a ".txt" extension.

In this example, the input file is the cars_tab.txt file. This is a text file based on the SASHELP.CARS dataset.

The first part you need following the PROC IMPORT statement is the datafile argument. The datafile argument is required so that SAS knows where the file you would like to import is stored and what the name of that file is. Inside the quotation marks following the datafile argument, you need to add the complete path, including the filename and file extension. Be sure to replace '/home/your_username/SASCrunch' with the correct directory on your machine or environment where cars_tab.txt is saved. In this example, "/home/your_username/SASCrunch" is the path, "cars_tab" is the filename, and ".txt" is the file extension.

To import tab-delimited text files, both the DBMS and DELIMITER options will need to be used. The DBMS value used for this example is DLM. The DLM value tells SAS that you would like to specify a custom delimiter for the dataset.

After closing off the PROC IMPORT statement with a semi-colon, a second option, DELIMITER, is added. The value of DELIMITER for a tab-delimited file is '09'x, which is the hexadecimal representation of a TAB on an ASCII platform.

Finally, the replace option is included to allow for multiple re-runs and overwrites of the CARS_TAB dataset in WORK. If you prefer not to overwrite the newly imported SAS dataset, you can simply remove the replace option.

Using these parameters, the following code will import the tab-delimited cars_tab.txt file and output a SAS dataset in WORK called CARS_TAB:

proc import datafile = '/home/your_username/SASCrunch/cars_tab.txt'
 out = cars_tab
 dbms = dlm
 replace;
 delimiter = '09'x;
run;

After running the above code, you will notice something is a bit off with the output dataset:

If you were to open the cars_tab.txt file directly using Notepad, Wordpad, TextEdit or similar on your computer, you would notice that this file has an extra row of invalid data in it. This type of situation often occurs when the text file is created from another data source.

Fortunately, SAS provides an option that you can add to your PROC IMPORT statement to skip this extra line of data that you don't need. By adding the datarow option, you can let SAS know at which row the data (observations) start. In this case, we know that the first row has the headings, the second row has no data, and the observations start on the third row, so we set datarow = 3:

proc import datafile = '/home/your_username/SASCrunch/cars_tab.txt'
 out = cars_tab
 dbms = dlm
 replace
 ;
 delimiter = '09'x;
 datarow = 3;
run;

In the output data shown partially below, you will see that the extra row has now been removed:


2. Importing Text Files Delimited with Special Characters

Since text files can contain any number of special characters as delimiters, the DELIMITER statement can be used with just about any keyboard character.

For example, if the values of a text file are delimited with the pipe bar "|", you can simply specify the pipe bar symbol in the DELIMITER statement, similar to how we used '09'x for tab-delimited files. In this example, the cars_pipe.txt file is read in to create the CARS_PIPE SAS dataset in the WORK library:

proc import datafile = '/home/your_username/SASCrunch/cars_pipe.txt'
 out = cars_pipe
 dbms = dlm
 replace
 ;
 delimiter = '|';
run;

After updating the path in the datafile argument and running the above code, you will notice that while the columns have been read in correctly, the variable names are not correct and actual values are being used as the variable names:

If you were to open the cars_pipe.txt file directly using Notepad, Wordpad, TextEdit or similar text editors on your computer, you would notice that this text file has no column headings and the data starts directly in the first row.

To get around this, you need to let SAS know that there are no column headings provided in the input text file. By default, there is a GETNAMES option in PROC IMPORT which is set to YES. With this setting equal to YES, SAS assumes that the first row of data contains the column headings, which ultimately end up as the SAS variable names. When this is not the case, simply set GETNAMES = NO to let SAS know there are no column headings provided in the input file:

proc import datafile = '/home/your_username/SASCrunch/cars_pipe.txt'
 out = cars_pipe
 dbms = dlm
 replace
 ;
 getnames = no;
 delimiter = '|';
run;

Now in the output data, all the records will be found in the dataset itself, but the variables will have generic names from VAR1 up to VAR15 in this case, since there are 15 columns:

To fix the variable names, you could for example use the SAS Data Step with the RENAME statement to create a new dataset. As an example, the following Data Step code would create a dataset called CARS_PIPE_CLEAN, which uses the RENAME statement to set the appropriate variable names as shown here:

data cars_pipe_clean;
 set cars_pipe;
 rename var1 = make
        var2 = model
        var3 = type
        var4 = origin
      /*var5 = ...
        var15 = ...*/
        ;
run;
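If you would rather not create a second copy of the data, an alternative worth considering is PROC DATASETS, which can rename variables in the existing dataset without re-reading its rows. This is a sketch of a different technique than the Data Step approach shown above, and it only renames the first four variables for illustration:

```sas
/* Rename variables in place in WORK.CARS_PIPE (no new dataset is created) */
proc datasets library = work nolist;
    modify cars_pipe;
    rename var1 = make
           var2 = model
           var3 = type
           var4 = origin;
quit;
```

Since only the dataset header is updated, this tends to be faster than a Data Step on large datasets, but the original VAR1 to VAR4 names are no longer available afterwards.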

3. Importing Space-delimited Text Files with PROC IMPORT

Space-delimited text files are yet another common file type you may encounter that you would like to import into SAS. By default, setting DBMS = DLM with your PROC IMPORT statement will use space as the delimiter, so you don't need to explicitly use the delimiter option in this example.

For example, the orsales_space.txt text file contains space-delimited columns, and can be imported into SAS with DBMS = DLM:

proc import datafile = '/home/your_username/SASCrunch/orsales_space.txt'
 out = orsales
dbms = dlm
 replace
 ;
run;

At first glance, it appears that the import was successful and the ORSALES dataset was successfully created in WORK as shown partially here:

However, if you run a PROC FREQ (code provided below) on the Product_Category variable, you will notice that one of its values is truncated:

proc freq data = orsales;
 tables product_category;
run;

As shown in the Results, "Assorted Sports Articles" is now only "Assorted Sports A" in this newly imported dataset:

This type of situation can often occur when importing datasets into SAS because PROC IMPORT will only check a portion of the records before determining what the appropriate variable types and lengths should be on the output SAS dataset.

The solution to this problem is to include the GUESSINGROWS option with your PROC IMPORT call. By specifying a number for GUESSINGROWS, you can tell SAS how many rows it should scan in your incoming dataset before determining what the appropriate lengths and variable types should be.

In this example import, there are 912 rows of data. Here, by setting GUESSINGROWS = 912 we can be certain that SAS will pick the largest width necessary to avoid truncation of any data when it completes the import. A new dataset, ORSALES_GUESSINGROWS, is then created so you can see the difference in results:

proc import datafile = '/home/your_username/SASCrunch/orsales_space.txt'
 out = orsales_guessingrows
 dbms = dlm
 replace
 ;
 guessingrows = 912;
run;

By running a PROC FREQ to generate a frequency table on the newly created dataset, we can test whether or not the GUESSINGROWS option was effective:

proc freq data = orsales_guessingrows;
 tables product_category;
run;

As you can see from the output, the Product_Category value "Assorted Sports Articles" now shows up correctly and is no longer truncated:

It's important to note that GUESSINGROWS can be extremely computationally intensive and may significantly increase the time it takes to import your dataset into SAS. The larger the value you set for GUESSINGROWS, the longer the processing will take, but the more reliable the results will be. The run time will of course depend on your environment, the number of records and the number of variables found in your data.
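When you do not know the row count in advance, recent SAS 9.4 releases also accept the keyword MAX in place of a number, which asks SAS to scan as many rows as it can before deciding on types and lengths. The sketch below assumes your SAS version supports GUESSINGROWS = MAX (older releases may only accept a number), and the run time caveat above applies in full:

```sas
/* Scan all available rows before choosing variable types and lengths */
proc import datafile = '/home/your_username/SASCrunch/orsales_space.txt'
    out = orsales_guessingrows
    dbms = dlm
    replace;
    guessingrows = max;
run;
```

For a small file such as this one the difference is negligible, but on very wide or very long files an explicit, smaller number may be the more practical choice.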

4. Importing a Tab-delimited File using Data Step

Although the amount of SAS code required to import a text file using Data Step is longer than the code required for PROC IMPORT, using Data Step code allows for greater flexibility.

By using Data Step code, the variable names, lengths and types can be manually specified at the time of import. The advantage is that this allows you to format the dataset exactly the way you want as soon as it is created in SAS, rather than having to make additional modifications later on.

First, as with any SAS Data Step code, you need to specify the name and location for the dataset you are going to create. Here, a dataset named CARS_DATASTEP_TAB will be created in the WORK library.

The next step is to use the INFILE statement. The INFILE statement in this case is made up of 6 components:

  1. The location of the text file – /home/your_username/SASCrunch/cars_tab.txt in this example
  2. Delimiter option – the delimiter found in the input file enclosed in quotation marks (delimiter is '09'x in this example since it is a tab-delimited file)
  3. MISSOVER option – Tells SAS to stay on the same record when it runs out of values before all variables are read, setting the remaining variables to missing instead of moving on to the next record
  4. FIRSTOBS – The first row that contains the observations in the input file (set to 3 in this example since the observations start on the third row in the cars_tab.txt file)
  5. DSD – Tells SAS that when a delimiter is found within quotation marks in the data, it should be treated as part of the value and not as a delimiter
  6. LRECL – Maximum length for an entire record (32767 is the value used here, which will ensure no truncation within 32767 characters)

After the INFILE statement, the simplest way to ensure that your variable names, lengths, types and formats are specified correctly is to use a format statement for each variable. After an appropriate format has been assigned to each variable, the variables that you would like to import should be listed in order after an INPUT statement. Note that character variables should have a dollar sign ($) after each variable name.

Note that you can also optionally specify INFORMATs and LENGTHs here, but in most cases the FORMAT and INPUT statements should be all you need for a successful import.
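As an illustration of where LENGTH and INFORMAT statements can help, the sketch below reads only the first six columns of cars_tab.txt and stores MSRP as a number rather than as text. It rests on several assumptions that are not confirmed by this article: that MSRP appears in the file with a dollar sign and commas (for example $36,945), that lengths of 13 and 40 are enough for Make and Model, and the dataset name cars_msrp_numeric, which is made up for this example.

```sas
data work.cars_msrp_numeric;   /* hypothetical dataset name */
    infile '/home/your_username/SASCrunch/cars_tab.txt'
        delimiter = '09'x missover firstobs = 3 dsd lrecl = 32767;

    /* Assumed lengths; adjust them to the longest values in your file */
    length Make $13 Model $40;

    /* COMMA informat strips dollar signs and commas when reading */
    informat MSRP comma10.;
    /* DOLLAR format only controls how the number is displayed */
    format MSRP dollar10.;

    input Make $ Model $ Type $ Origin $ DriveTrain $ MSRP;
run;
```

Reading MSRP as a numeric variable means it can be summed or averaged directly, which is not possible when it is imported as a character value.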

Below is the Data Step code that would successfully import the cars_tab.txt file into a SAS dataset. As mentioned, be sure to update the path to the correct location of the cars_tab.txt file in your environment before running the following code:

data work.cars_datastep_tab;
 infile '/home/your_username/SASCrunch/cars_tab.txt'
        delimiter = '09'x
        missover
        firstobs = 3   /* observations start on the third row */
        dsd
        lrecl = 32767;

        format Make $5. ;
        format Model $30. ;
        format Type $6. ;
        format Origin $6. ;
        format DriveTrain $5. ;
        format MSRP $9. ;
        format Invoice $9. ;
        format EngineSize best12. ;
        format Cylinders best12. ;
        format Horsepower best12. ;
        format MPG_City best12. ;
        format MPG_Highway best12. ;
        format Weight best12. ;
        format Wheelbase best12. ;
        format Length best12. ;
 input
                 Make $
                 Model $
                 Type $
                 Origin $
                 DriveTrain $
                 MSRP $
                 Invoice $
                 EngineSize
                 Cylinders
                 Horsepower
                 MPG_City
                 MPG_Highway
                 Weight
                 Wheelbase
                 Length
     ;
run;

After running the above code, you should see the CARS_DATASTEP_TAB dataset, shown partially here:


5. Generating Data Step Code with PROC IMPORT

When the variable names, types, lengths or formats that SAS is automatically generating with PROC IMPORT are not what you are looking for, and you don't want to type out 40+ lines of code as in the previous example, PROC IMPORT can still be a time-saving tool.

Going back to the cars_pipe.txt text file, recall that this text file did not contain column headings.

Re-run the following code to import cars_pipe.txt into SAS and create a temporary dataset, CARS_PIPE, to be stored in WORK:

proc import datafile = '/home/your_username/SASCrunch/cars_pipe.txt'
 out = cars_pipe
 dbms = dlm
 replace
 ;
 getnames = no;
 delimiter = '|';
run;

After running the above code, go to the Log that is created and notice that SAS Data Step code is actually being generated as a result of the PROC IMPORT:

By simply copying and pasting this code from your log into your SAS program, you can now use this code as a template to start your Data Step code, modifying it as needed to adjust variable names, types and lengths.

For example, you can replace the variable names VAR1-VAR15 with the original variable names from CARS, as shown here:

data Work.CARS_PIPE_CUSTOM    ;
infile '/home/your_username/SASCrunch/cars_pipe.txt' delimiter = '|' MISSOVER DSD lrecl = 32767;
     informat make $5. ;
     informat model $30. ;
     informat type $6. ;
     informat origin $6. ;
     informat drivetrain $5. ;
     informat msrp nlnum32. ;
     informat invoice nlnum32. ;
     informat enginesize best32. ;
     informat cylinders best32. ;
     informat horsepower best32. ;
     informat mpg_city best32. ;
     informat mpg_highway best32. ;
     informat weight best32. ;
     informat wheelbase best32. ;
     informat length best32. ;
     format make $5. ;
     format model $30. ;
     format type $6. ;
     format origin $6. ;
     format drivetrain $5. ;
     format msrp nlnum12. ;
     format invoice nlnum12. ;
     format enginesize best12. ;
     format cylinders best12. ;
     format horsepower best12. ;
     format mpg_city best12. ;
     format mpg_highway best12. ;
     format weight best12. ;
     format wheelbase best12. ;
     format length best12. ;
  input
              make $
              model $
              type $
              origin $
              drivetrain $
              msrp
              invoice
              enginesize
              cylinders
              horsepower
              mpg_city
              mpg_highway
              weight
              wheelbase
              length
  ;
run;

After running the above code, a new dataset Work.CARS_PIPE_CUSTOM is created by importing the cars_pipe.txt text file using the SAS Data Step code we generated using PROC IMPORT.


Source: https://sascrunch.com/importing-text-files/
