
Stata is a general-purpose statistical software package created in 1985 by StataCorp. Most of its users work in research, especially in the fields of economics, sociology, political science, biomedicine, and epidemiology.[2]

Stata's capabilities include data management, statistical analysis, graphics, simulations, regression, and custom programming. It also has a system to disseminate user-written programs that lets it grow continuously.

The name Stata is a syllabic abbreviation of the words statistics and data.[3] The FAQ for the official Stata forum insists that the correct English pronunciation of Stata "must remain a mystery"; any of "Stay-ta", "Sta-ta" or "Stah-ta" (rhyming with the three pronunciations of 'data') is considered acceptable. More recent updates indicate that Stata employees pronounce it /ˈsteɪtə/.[3]

There are four major builds of each version of Stata:[4]

Stata/MP for multiprocessor computers (including dual-core and multicore processors)
Stata/SE for large databases
Stata/IC, which is the standard version
Numerics by Stata, which supports any of the data sizes listed above in an embedded environment

Small Stata, which was the smaller, student version for educational purchase only, is no longer available.

User interface

Stata has always emphasized a command-line interface, which facilitates replicable analyses. Starting with version 8.0, however, Stata has included a graphical user interface based on the Qt framework, which uses menus and dialog boxes to give access to nearly all built-in commands. Each dialog generates the equivalent command, which is always displayed, easing the transition to the command-line interface and the more flexible scripting language. The dataset can be viewed or edited in spreadsheet format. From version 11 on, other commands can be executed while the data browser or editor is open.

Data structure and storage

Until the release of version 16[5], Stata could only open a single dataset at any one time. Stata holds datasets in (random-access or virtual) memory, which limits its use with extremely large datasets. This is mitigated to some extent by efficient internal storage, as there are integer storage types which occupy only one or two bytes rather than four, and single-precision (4 bytes) rather than double-precision (8 bytes) is the default for floating-point numbers.
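The effect of storage types on memory use can be sketched as follows. This is a minimal illustration, not taken from the Stata documentation; the variable names (x, flag, year, y) are invented for the example.

```stata
* Build a small artificial dataset to compare storage types
clear
set obs 1000

* double: 8 bytes per value, requested explicitly
generate double x = runiform()

* byte: 1 byte per value, enough for a 0/1 indicator
generate byte flag = x > 0.5

* int: 2 bytes per value, enough for calendar years
generate int year = 1985

* no type given, so the default floating-point type (float, 4 bytes) is used
generate y = x

* list each variable with its storage type
describe

* ask Stata to demote variables to smaller types where no information is lost
compress
```

Running describe before and after compress shows which variables, if any, could be stored more compactly.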

The dataset is always rectangular in format, that is, all variables hold the same number of observations (in more mathematical terms, all vectors have the same length, although some entries may be missing values).

Data format compatibility

Stata can import data in a variety of formats. This includes ASCII data formats (such as CSV or databank formats) and spreadsheet formats (including various Excel formats).
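As a hedged sketch of what such imports look like in recent releases, the commands below read a CSV file and an Excel worksheet. The file names survey.csv and survey.xlsx and the sheet name Sheet1 are hypothetical placeholders, and the import delimited and import excel commands are assumed to be available in the installed version.

```stata
* Read a comma-separated text file (hypothetical file name),
* replacing any dataset currently in memory
import delimited using "survey.csv", clear

* Read one worksheet of an Excel workbook (hypothetical file and sheet names),
* treating the first row as variable names
import excel using "survey.xlsx", sheet("Sheet1") firstrow clear
```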

Stata's proprietary file formats have changed over time, although not every Stata release includes a new dataset format. Every version of Stata can read all older dataset formats and can write both the current format and, using the saveold command, the most recent previous format.[6] Thus, the current Stata release can always open datasets that were created with older versions, but older versions cannot read newer format datasets.
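A minimal sketch of saving in both formats follows. It assumes a release in which saveold accepts the version() option (Stata 14 or later, as far as this example is concerned); the output file names are hypothetical.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Save in the format of the running release
save "auto_current.dta", replace

* Save in an older format so that colleagues on Stata 13 can open the file
saveold "auto_v13.dta", version(13) replace
```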

Stata can read and write SAS XPORT format datasets natively, using the fdause and fdasave commands.

Some other econometric applications, including gretl, can directly import Stata file formats.

Extensibility

Stata allows user-written commands, distributed as so-called ado-files, to be downloaded directly from the internet; once installed, they are indistinguishable to the user from the built-in commands. In this respect, Stata combines the extensibility more often associated with open-source packages with features usually associated with commercial packages, such as software verification, technical support and professional documentation. Some user-written commands have later been adopted by StataCorp to become part of a subsequent official release after appropriate checking, certification, and documentation.
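Installing a user-written command might look like the sketch below. It assumes an internet connection and uses estout, a package distributed through the SSC archive, purely as an illustration; any other package name could be substituted.

```stata
* Download and install a user-written package from the SSC archive
ssc install estout, replace

* Show where the ado-file of one of its commands was installed
which esttab

* List all user-written packages installed on this machine
ado dir
```

After installation, esttab is called like any built-in command.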

User community

Stata had an active email list from August 1994 ("Statalist", over 1000 messages per month), which was turned into a web forum in March 2014 and is still called "Statalist".[3] StataCorp employees regularly contribute to Statalist. Statalist is maintained by Marcello Pagano of the Harvard School of Public Health, and not by StataCorp itself.

Articles about the use of Stata and new user-written commands are published in the Stata Journal, a quarterly peer-reviewed publication covering statistics, data analysis, teaching methods, and effective use of Stata's language.

User Group meetings are held annually in the United States (the Stata Conference), the UK, Germany, and Italy, and less frequently in several other countries. Only the annual Stata Conference held in the United States is hosted by StataCorp LP. Local Stata distributors host User Group meetings in their own countries; however, Stata developers frequently travel to and present at these meetings. Established under the Societies Act on 10 May 2008, the Singapore Stata Users Group (StataUGS) is the world's first government-approved users group (Registration No: 2048/2008; Unique Entity No: T08SS0091A). Its slogan is "Shaping Data Meaningfully". As a non-profit organisation, StataUGS does not organise regular meetings but provides programming and statistical advice to users in Singapore through informal means. The active members of StataUGS are mostly engaged in biomedical research.

Example Stata code

To perform a linear (OLS) regression of y on x:

regress y x [if]

The optional if qualifier restricts the sample used by the command to a subset of the observations. For example, to apply the command only to the females in the sample, one could specify: if female == 1.
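A concrete version of this, sketched with the auto example dataset that ships with Stata rather than the placeholder names y, x and female:

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Regress price on mileage using all cars
regress price mpg

* Same regression, restricted to foreign cars (foreign is coded 0/1)
regress price mpg if foreign == 1
```

The second regression reports fewer observations than the first, because only the rows satisfying the condition are used.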

To perform logistic regression of y on x:

logistic y x

To display a scatter plot of y against x restricted to values of x below 10:

scatter y x if x < 10

To perform OLS regression of y on x with White's heteroscedasticity-consistent standard errors:

regress y x, vce(robust)

To calculate the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) after fitting a regression:[7]

estat ic
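The commands in this section use y and x as placeholders. A runnable sketch that strings them together on the auto example dataset shipped with Stata could look like this; the choice of variables is illustrative only.

```stata
* Load the example dataset that ships with Stata
sysuse auto, clear

* Linear regression, followed by AIC and BIC for the fitted model
regress price mpg weight
estat ic

* Same model with heteroscedasticity-consistent standard errors
regress price mpg weight, vce(robust)

* Logistic regression of a 0/1 outcome; results are shown as odds ratios
logistic foreign mpg weight

* Scatter plot restricted to cars with mileage below 30
scatter price mpg if mpg < 30
```

Note that estat ic is a postestimation command: it refers to the most recently fitted model, so it must follow an estimation command.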

To code "fizzbuzz":

program define fizzbuzz
    args x
    forvalues i = 1/`x' {
        if mod(`i',15) == 0 {
            display "fizzbuzz"
        }
        else if mod(`i',5) == 0 {
            display "buzz"
        }
        else if mod(`i',3) == 0 {
            display "fizz"
        }
        else {
            display `i'
        }
    }
end

Timeline of releases

Since 2000, StataCorp has issued a new major release of Stata (incrementing the integer part of the version number) roughly every two years. Users must pay a fee to upgrade to the latest major release. Minor releases (incrementing the decimal part of the version number) are sometimes made available between major releases. These are available as free downloadable updates to those who have a licence for the previous major release. Dates of all releases are available on the Stata website.[8] Stata 16 was released on June 26, 2019.

Stata's versioning system is designed to give a very high degree of backward compatibility, ensuring that code written for previous releases continues to work.[9] However, users should take care when saving or opening data across different versions, because older releases cannot read datasets saved in newer formats.
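In practice this is usually done by declaring a version at the top of a script. The sketch below assumes it is run in Stata 14 or later; the version number and the analysis itself are illustrative.

```stata
* Ask Stata to interpret the commands that follow as release 14 would,
* even when a newer release is installed
version 14

* Load the example dataset that ships with Stata and fit a simple model
sysuse auto, clear
regress price mpg
```

Placing the version statement first means the script keeps the behaviour it was written for when it is rerun under a later release.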

See also

List of statistical packages
Comparison of statistical packages
Data analysis

References

"Stata Journal | Article". www.stata-journal.com.
"Who uses Stata?". Stata. Retrieved 2017-06-28.
"Help - Statalist". www.statalist.org.
"Which Stata is right for me?". Stata. Retrieved 2010-04-04.
"Data frames: multiple datasets in memory". www.stata.com. Retrieved 2020-08-13.
"Stata 16 help for save". www.stata.com.
"Choosing Regression Model in Stata".
"Stata | FAQ: History of Stata". www.stata.com.
"Stata 16 help for version". www.stata.com.

Further reading

Bittmann, Felix (2019). Stata - A Really Short Introduction. Boston: DeGruyter Oldenbourg. ISBN 978-3-11061-729-0.
Pinzon, Enrique, ed. (2015). Thirty Years with Stata: A Retrospective. College Station, Texas: Stata Press. ISBN 978-1-59718-172-3.
Hamilton, Lawrence C. (2013). Statistics with STATA. Boston: Cengage. ISBN 978-0-84006-463-9.

External links

Official website
Stata Journal
Stata Press



Retrieved from "http://en.wikipedia.org/"
All text is available under the terms of the GNU Free Documentation License