IMPORTING/EXPORTING DATA
- Import and export data from Excel .xls and .xlsx files
- Import and export CSV and delimited data
- Copy/paste data from spreadsheets
- Input data in spreadsheet editor
- Read from and write to SQL sources with ODBC (see below)
- Import and export fixed-format data using a dictionary
- Import and export any type of text data
- Unicode (UTF-8) support, including conversion from/to extended ASCII
- Import EBCDIC data and convert EBCDIC to ASCII
- Import SAS files
- Import SPSS files
- Import and export data in the format required by the FDA for NDA submittals
- Import and export SAS Transport XPORT files
- Import Federal Reserve Economic Data
- Import from Haver Analytics databases
- Import data from Wharton Research Data Services (WRDS) via JDBC
- Import and export dBase files
- High-level import/export of full Excel worksheets
- Low-level cell-by-cell access to write results to and read data from Excel, including graphs, formulas, date formats, currency formats, bold, italics, and more
- Export data to SPSS New
Video – Copy/Paste from Excel® into Stata
Video – Importing delimited data
JDBC SUPPORT
- Import data from Oracle, Microsoft SQL Server, MySQL, Amazon Redshift, Snowflake, and other databases
- Export data to an existing database table
- Execute SQL statements on a database
- Create data source names to store connection settings
- Support for CLOBs, BLOBs, and Unicode
- Import data using GUI New
ODBC SUPPORT
- Import data from any ODBC data source, such as Oracle, SQL Server, Access, Excel, MySQL, and DB2
- Export data to new or existing ODBC tables
- Execute custom SQL commands individually or in batches
- Customize ODBC connection strings
- Support for ODBC
- Support for VARCHARs/CLOBs and BLOBs
- Support for Unicode
BUILT-IN SPREADSHEET EDITOR
- Clipboard Preview Tool lets you control how data will be pasted
- Manage variables with the Variables Tool
- For Windows, Mac, and Unix
- Pinnable rows and columns New
- Resizable cell editor for string data New
- Tool tips for truncated text New
- Proportional width font support New
- Columns can be resized and are preserved when saving the dataset
- Show variable labels in column header New
- Keyboard shortcut for hiding and showing value labels New
PROPERTIES WINDOW
- Manage variables
- Manage dataset properties
- For Windows, Mac, and Unix
VARIABLES MANAGER
- Change storage types, names, and formats
- Add and edit value labels
- Attach notes to variables
- Filter variables
- For Windows, Mac, and Unix
Video – Label the values of categorical variables
Video – Change the display format of a variable
Video – Add notes to a variable
FUNCTIONS
- Statistical functions
- Mathematical functions
- Trigonometric functions
- String functions
- Unicode functions
- Regular expressions
- Advanced regular expression functions
- Date and time functions
- Durations, relative dates, datetime components
- Week-related functions
- Time-series functions
- Random-number functions
- 18 functions
- Stream random numbers
- Matrix functions
- Programming functions
DATA REORGANIZATION
- Row–column transposition
- Data reshaping
- Stacking of variables
- Collapsing into means, totals, etc.
Video – Reshape data from long format to wide format
UNICODE SUPPORT
- UTF-8
- Translation of extended ASCII to UTF-8
- Unicode-aware string functions
- Locale-based sorting and string comparison
LABELS
- Dataset labels
- Variable labels
- Value labels (e.g., male and female for 0 and 1)
© Copyright 1996–2024 StataCorp LLC. All rights reserved.
- Ability to switch between multiple sets of data, variable, and value labels
- Missing-value labels
- Support for multiple languages, including Unicode support
Notes
- Extensive notes can be attached to a dataset
Data snapshots
- Allow multiple levels of undo for modified datasets
Multiple datasets in memory (frames)
- Link frames
- Copy data between frames
- Access data in other frames
- Post simulation results to a frame
- Manipulate frames
- Access frames from Mata
- Save, load, describe multiple frames
Automatic memory management
- Terabytes of RAM supported
- Up to 120,000 variables in Stata/MP; up to 32,767 variables in Stata/SE
- 20 billion or more observations in Stata/MP
- Up to 2.1 billion observations in Stata/SE and Stata/BE
Sorting
- Ascending or descending sorts
- Multiple-key sorts
- Numeric and string sorts
- Locale-aware Unicode string sorting and comparison
Combining datasets
- Merge datasets
- By key variables
- By observations
- Join datasets
- Outer join
- Append datasets
- Append time series
Video – How to append files into a single dataset.
Special datasets
- Longitudinal data/panel data
- Survival/duration data
- Time series
- Survey data
- Multiple imputations
- Discrete choice data
- Spatial data
Utilities
- Count number of observations that satisfy specified conditions
- Formatted and unformatted disk I/O
- Zip-file support
- Unicode conversion from/to extended ASCII
- Custom filters to manipulate text files
Variable management
- Generation of new variables
- Replacement of existing variables
- Renaming variables
- Encoding and decoding string variables
- Reordering variables in the dataset
- Variables Manager
Video – Convert categorical string variables to labeled numeric variables.
Video – Create a categorical variable from a continuous variable.
Video – Convert a string variable to a numeric variable.
Video – Identify and replace unusual data values.
Video – Convert missing value codes to missing values.
Dataset utilities
- Flexible description of variables, labels, and types
- List values of variables
- Data signatures to verify the integrity of datasets
- Codebooks for variables
- Value-label reports
- Duplicates and missing values tables
- Compress (make dataset as small as possible without loss of accuracy)
Variable types
- Numeric storage types
- Byte
- Integer (int)
- Long
- Float
- Double
- String (including Unicode, very long strings, and BLOBs)
- Dates and times
- Business calendars
Long string support
- Strings up to 2 billion characters long
- Coalescing of duplicate values to save memory
- Binary ‘strings’ (BLOBs)
- Import and export entire files into long strings/BLOBs
- Unicode (UTF-8) strings
Stored results
- Save results to disk for later use
- Store estimation results in memory
- Create tables to compare results
- Create custom tables