Search Resources

366 Results

Selected filters:
  • analysis
The Programming Historian 2: Output Data as an HTML File
Unrestricted Use
CC BY
Rating
0.0 stars

This lesson takes the frequency pairs created in Counting Frequencies and outputs them to an HTML file.

Here you will learn how to output data as an HTML file using Python. You will also learn about string formatting. The final result is an HTML file that shows the keywords found in the original source in order of descending frequency, along with the number of times that each keyword appears.
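As a rough illustration of the idea described above, the sketch below sorts keyword/count pairs by descending frequency and writes them into an HTML page using string formatting. The sample pairs, the function name pairs_to_html, and the output filename are assumptions made for this example; they are not taken from the lesson itself.

```python
import html
import os
import tempfile

# Hypothetical (keyword, count) pairs standing in for the output of the
# Counting Frequencies lesson.
frequency_pairs = [("court", 12), ("prisoner", 31), ("jury", 7)]


def pairs_to_html(pairs):
    """Return an HTML page listing keywords in order of descending count."""
    ordered = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    # str.format fills each {} placeholder with the next argument.
    items = "\n".join(
        "<li>{}: {}</li>".format(html.escape(word), count)
        for word, count in ordered
    )
    return "<html>\n<body>\n<ol>\n{}\n</ol>\n</body>\n</html>\n".format(items)


page = pairs_to_html(frequency_pairs)

# Write the page to a file (here in the system temp directory).
output_path = os.path.join(tempfile.gettempdir(), "frequencies.html")
with open(output_path, "w", encoding="utf-8") as f:
    f.write(page)
```

Opening the resulting file in a browser should show an ordered list with the most frequent keyword first.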

Subject:
Applied Science
Computer Science
Material Type:
Diagram/Illustration
Provider:
Center for History and New Media
Author:
William J. Turkel and Adam Crymble
Date Added:
06/16/2015
The Programming Historian 2: Output Keywords in Context in HTML File
Unrestricted Use
CC BY
Rating
0.0 stars

This lesson builds on Keywords in Context (Using N-grams), where n-grams were extracted from a text. Here, you will learn how to output all of the n-grams of a given keyword in a document downloaded from the Internet, and display them clearly in your browser window.
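A minimal sketch of the keyword-in-context idea, assuming a plain list of words as input: it builds every n-gram, keeps those with the keyword in the middle position, and wraps them in a simple HTML block. The function names, the sample sentence, and the choice of a pre element are illustrative assumptions, not the lesson's own code.

```python
def get_ngrams(words, n):
    """Return every run of n consecutive words as a list of lists."""
    return [words[i:i + n] for i in range(len(words) - n + 1)]


def keyword_in_context(words, keyword, n=5):
    """Keep only the n-grams whose middle word is the keyword."""
    middle = n // 2
    return [gram for gram in get_ngrams(words, n) if gram[middle] == keyword]


# Hypothetical sample text; a real run would use a downloaded document.
words = "it was the best of times it was the worst of times".split()
kwic = keyword_in_context(words, "of", n=5)

# Build a small HTML fragment with one line per matching n-gram.
lines = [" ".join(gram) for gram in kwic]
kwic_html = "<pre>\n{}\n</pre>".format("\n".join(lines))
```

Note that with this simple approach a keyword too close to the start or end of the text has no full n-gram centred on it, so the second "of" in the sample is not reported.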

Subject:
Applied Science
Computer Science
Material Type:
Diagram/Illustration
Provider:
Center for History and New Media
Author:
William J. Turkel and Adam Crymble
Date Added:
06/16/2015
The Programming Historian 2: Python Introduction and Installation
Unrestricted Use
CC BY
Rating
0.0 stars

Downloading a single record from a website is easy, but downloading many records at a time – an increasingly frequent need for a historian – is much more efficient using a programming language such as Python. In this lesson, we will write a program that will download a series of records from the Old Bailey Online using custom search criteria, and save them to a directory on our computer. This process involves interpreting and manipulating URL Query Strings. In this case, the tutorial will seek to download sources that contain references to people of African descent that were published in the Old Bailey Proceedings between 1700 and 1750.
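To make the idea of manipulating URL query strings concrete, here is a small sketch using Python's standard urllib.parse module. The base URL and every parameter name below are hypothetical placeholders chosen for illustration; they are not the Old Bailey Online's actual search interface.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical search endpoint, used only for illustration.
BASE_URL = "https://www.example.org/search"


def build_search_url(keyword, from_year, to_year, start=0, count=10):
    """Assemble a search URL from custom criteria (parameter names assumed)."""
    params = {
        "keyword": keyword,
        "fromYear": from_year,
        "toYear": to_year,
        "start": start,
        "count": count,
    }
    # urlencode joins key=value pairs with "&" and escapes special characters.
    return BASE_URL + "?" + urlencode(params)


# One URL per page of results: entries 0-9, 10-19, 20-29.
page_urls = [
    build_search_url("example term", 1700, 1750, start=s)
    for s in range(0, 30, 10)
]

# A query string can also be taken apart again for inspection.
parsed = parse_qs(urlsplit(page_urls[1]).query)
```

Generating the URLs in a loop like this is what would let a program step through many result pages without retyping each address by hand.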

Subject:
Applied Science
Computer Science
Material Type:
Diagram/Illustration
Provider:
Center for History and New Media
Author:
Adam Crymble
Date Added:
06/16/2015
The Programming Historian 2: Python Introduction and Installation
Unrestricted Use
CC BY
Rating
0.0 stars

This first lesson in our section on dealing with Online Sources is designed to get you and your computer set up to start programming. We will focus on installing the relevant software – all free and reputable – and finally we will help you to get your toes wet with some simple programming that provides immediate results.

In this opening module you will install the Python programming language, the Beautiful Soup HTML/XML parser, and a text editor. Screencaps provided here come from Komodo Edit, but you can use any text editor capable of working with Python. Here’s a list of other options: Python Editors. Once everything is installed, you will write your first programs, “Hello World” in Python and HTML.

Subject:
Applied Science
Computer Science
Material Type:
Diagram/Illustration
Provider:
Center for History and New Media
Author:
William J. Turkel and Adam Crymble
Date Added:
06/16/2015
The Programming Historian 2: Transliterating non-ASCII characters with Python
Unrestricted Use
CC BY
Rating
0.0 stars

This lesson shows how to use Python to automatically transliterate a list of words from a language with a non-Latin alphabet to a standardized format using the American Standard Code for Information Interchange (ASCII) characters. It builds on readers’ understanding of Python from the lessons “Viewing HTML Files,” “Working with Web Pages,” “From HTML to List of Words (part 1)” and “Intro to Beautiful Soup.” At the end of the lesson, we will use the transliteration dictionary to convert the names from a database of the Russian organization Memorial from Cyrillic into Latin characters. Although the example uses Cyrillic characters, the technique can be reproduced with other alphabets using Unicode.
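The dictionary-based approach can be sketched roughly as follows. The mapping here is a deliberately tiny, hypothetical subset of a Cyrillic-to-Latin table, and the function name transliterate is an assumption for this example; a real table would cover the full alphabet and follow a chosen romanization standard.

```python
# Partial, illustrative Cyrillic-to-Latin table (not a complete standard).
CYRILLIC_TO_LATIN = {
    "И": "I", "Ж": "Zh",
    "а": "a", "в": "v", "к": "k", "н": "n", "о": "o", "у": "u",
}

# str.maketrans accepts a dict mapping single characters to strings,
# so one Cyrillic letter may become several Latin letters.
TABLE = str.maketrans(CYRILLIC_TO_LATIN)


def transliterate(word):
    """Replace each mapped character; unmapped characters pass through."""
    return word.translate(TABLE)


names = ["Иванов", "Жуков"]
latin_names = [transliterate(name) for name in names]
```

Because unmapped characters are left unchanged, any Cyrillic letters remaining in the output indicate gaps in the table.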

Subject:
Applied Science
Computer Science
Material Type:
Diagram/Illustration
Provider:
Center for History and New Media
Author:
Seth Bernstein
Date Added:
06/16/2015
The Programming Historian 2: Understanding Regular Expressions
Unrestricted Use
CC BY
Rating
0.0 stars

In this exercise we will use advanced find-and-replace capabilities in a word processing application in order to make use of structure in a brief historical document that is essentially a table in the form of prose. Without using a general programming language, we will gain exposure to some aspects of computational thinking, especially pattern matching, that can be immediately helpful to working historians (and others) using word processors, and can form the basis for subsequent learning with more general programming environments.
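The lesson itself works in a word processor rather than a programming language, but the same pattern-matching idea can be sketched in Python with the standard re module. The sample sentence and the pattern below are invented for illustration and do not come from the lesson's historical document.

```python
import re

# Hypothetical prose that is "essentially a table": one record per sentence.
text = "Arizona had 12 cases and 3 deaths. Utah had 5 cases and 0 deaths."

# Each pair of parentheses captures one column of the implied table.
pattern = r"(\w+) had (\d+) cases and (\d+) deaths"

# findall returns one tuple of captured strings per match.
matches = re.findall(pattern, text)

# Convert the captured digit strings to integers to get table-like rows.
rows = [(place, int(cases), int(deaths)) for place, cases, deaths in matches]
```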

Subject:
Applied Science
Computer Science
Material Type:
Diagram/Illustration
Homework/Assignment
Provider:
Center for History and New Media
Author:
Doug Knox
Date Added:
06/16/2015
The Programming Historian 2: Viewing HTML Files
Unrestricted Use
CC BY
Rating
0.0 stars

When you are working with online sources, much of the time you will be using files that have been marked up with HTML (Hyper Text Markup Language). Your browser already knows how to interpret HTML, which is handy for human readers. Most browsers also let you see the HTML source code for any page that you visit. The two images below show a typical web page (from the Old Bailey Online) and the HTML source used to generate that page, which you can see with the Tools -> Web Developer -> Page Source command in Firefox.

Subject:
Arts and Humanities
History
Material Type:
Diagram/Illustration
Provider:
Center for History and New Media
Author:
Adam Crymble
William J. Turkel
Date Added:
06/14/2015
The Programming Historian 2: Working With Web Pages
Unrestricted Use
CC BY
Rating
0.0 stars

This lesson introduces Uniform Resource Locators (URLs) and explains how to use Python to download and save the contents of a web page to your local hard drive.
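One possible way to download and save a page with Python's standard library is sketched below. The function name save_page is an assumption for this example, and the lesson's own code may be organized differently.

```python
import urllib.request


def save_page(url, filename):
    """Download the resource at url, save its raw bytes, return the size."""
    # urlopen returns a response object that can be read like a file.
    with urllib.request.urlopen(url) as response:
        contents = response.read()
    # Write in binary mode so the bytes are stored exactly as received.
    with open(filename, "wb") as f:
        f.write(contents)
    return len(contents)
```

Calling save_page with a page's URL and a local filename would store a copy of that page on disk, which can then be opened in a browser or text editor.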

Subject:
Applied Science
Computer Science
Material Type:
Diagram/Illustration
Provider:
Center for History and New Media
Author:
William J. Turkel and Adam Crymble
Date Added:
06/16/2015
The Programming Historian 2: Working with Text Files
Unrestricted Use
CC BY
Rating
0.0 stars

In this lesson you will learn how to manipulate text files using Python. This includes opening, closing, reading from, and writing to .txt files.

The next few lessons will involve downloading a web page from the Internet and reorganizing the contents into useful chunks of information. You will be doing most of your work using Python code written and executed in Komodo Edit.
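The basic file operations mentioned above might look something like this sketch. The filename and text are placeholders, and the with statement is used here so that each file is closed automatically; the lesson may instead call close() explicitly.

```python
import os
import tempfile

# Placeholder path in the system temp directory.
path = os.path.join(tempfile.gettempdir(), "helloworld.txt")

# "w" creates the file (or overwrites it) and opens it for writing.
with open(path, "w", encoding="utf-8") as f:
    f.write("hello world\n")

# "a" opens the file for appending, keeping what is already there.
with open(path, "a", encoding="utf-8") as f:
    f.write("second line\n")

# "r" opens the file for reading; read() returns the whole contents.
with open(path, "r", encoding="utf-8") as f:
    contents = f.read()
```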

Subject:
Applied Science
Computer Science
Material Type:
Diagram/Illustration
Provider:
Center for History and New Media
Author:
William J. Turkel and Adam Crymble
Date Added:
06/16/2015
Programming with MATLAB
Unrestricted Use
CC BY
Rating
0.0 stars

The best way to learn how to program is to do something useful, so this introduction to MATLAB is built around a common scientific task: data analysis. Our real goal isn’t to teach you MATLAB, but to teach you the basic concepts that all programming depends on. We use MATLAB in our lessons because: we have to use something for examples; it’s well-documented; it has a large (and growing) user base among scientists in academia and industry; and it has a large library of packages available for performing diverse tasks. But the two most important things are to use whatever language your colleagues are using, so that you can share your work with them easily, and to use that language well.

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Gerard Capes
Date Added:
03/20/2017
Programming with Python
Unrestricted Use
CC BY
Rating
0.0 stars

The best way to learn how to program is to do something useful, so this introduction to Python is built around a common scientific task: data analysis.

Arthritis Inflammation

We are studying inflammation in patients who have been given a new treatment for arthritis, and need to analyze the first dozen data sets of their daily inflammation. The data sets are stored in comma-separated values (CSV) format: each row holds information for a single patient, columns represent successive days. The first three rows of our first file look like this:

0,0,1,3,1,2,4,7,8,3,3,3,10,5,7,4,7,7,12,18,6,13,11,11,7,7,4,6,8,8,4,4,5,7,3,4,2,3,0,0
0,1,2,1,2,1,3,2,2,6,10,11,5,9,4,4,7,16,8,6,18,4,12,5,12,7,11,5,11,3,3,5,4,4,5,5,1,1,0,1
0,1,1,3,3,2,6,2,5,9,5,7,4,5,4,15,5,11,9,10,19,14,12,17,7,12,11,7,4,2,10,5,4,2,2,3,2,2,1,1

Each number represents the number of inflammation bouts that a particular patient experienced on a given day. For example, value “6” at row 3 column 7 of the data set above means that the third patient was experiencing inflammation six times on the seventh day of the clinical study.

So, we want to:
  • Calculate the average inflammation per day across all patients.
  • Plot the result to discuss and share with colleagues.

To do all that, we’ll have to learn a little bit about programming.
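As a rough sketch of the first goal, the three sample rows quoted in the description can be parsed and averaged per day using only the standard library. The lesson itself may well use other tools for this; the variable names are assumptions for this example, and the plotting step is omitted.

```python
import csv
import io

# The three sample rows quoted in the description.
SAMPLE = """\
0,0,1,3,1,2,4,7,8,3,3,3,10,5,7,4,7,7,12,18,6,13,11,11,7,7,4,6,8,8,4,4,5,7,3,4,2,3,0,0
0,1,2,1,2,1,3,2,2,6,10,11,5,9,4,4,7,16,8,6,18,4,12,5,12,7,11,5,11,3,3,5,4,4,5,5,1,1,0,1
0,1,1,3,3,2,6,2,5,9,5,7,4,5,4,15,5,11,9,10,19,14,12,17,7,12,11,7,4,2,10,5,4,2,2,3,2,2,1,1
"""

# One list of integers per patient (row).
patients = [[int(value) for value in row]
            for row in csv.reader(io.StringIO(SAMPLE))]

# zip(*patients) regroups the values by day (column) instead of by patient.
daily_means = [sum(day) / len(day) for day in zip(*patients)]

# Row 3, column 7 in 1-based counting is index [2][6] in Python.
third_patient_day_seven = patients[2][6]
```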

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Anne Fouilloux
Lauren Ko
Maxim Belkin
Trevor Bekolay
Valentina Staneva
Date Added:
08/07/2020
Programming with R
Unrestricted Use
CC BY
Rating
0.0 stars

The best way to learn how to program is to do something useful, so this introduction to R is built around a common scientific task: data analysis. Our real goal isn’t to teach you R, but to teach you the basic concepts that all programming depends on. We use R in our lessons because: we have to use something for examples; it’s free, well-documented, and runs almost everywhere; it has a large (and growing) user base among scientists; and it has a large library of external packages available for performing diverse tasks. But the two most important things are to use whatever language your colleagues are using, so you can share your work with them easily, and to use that language well.

We are studying inflammation in patients who have been given a new treatment for arthritis, and need to analyze the first dozen data sets of their daily inflammation. The data sets are stored in CSV format (comma-separated values): each row holds information for a single patient, and the columns represent successive days. The first few rows of our first file look like this:

0,0,1,3,1,2,4,7,8,3,3,3,10,5,7,4,7,7,12,18,6,13,11,11,7,7,4,6,8,8,4,4,5,7,3,4,2,3,0,0
0,1,2,1,2,1,3,2,2,6,10,11,5,9,4,4,7,16,8,6,18,4,12,5,12,7,11,5,11,3,3,5,4,4,5,5,1,1,0,1
0,1,1,3,3,2,6,2,5,9,5,7,4,5,4,15,5,11,9,10,19,14,12,17,7,12,11,7,4,2,10,5,4,2,2,3,2,2,1,1
0,0,2,0,4,2,2,1,6,7,10,7,9,13,8,8,15,10,10,7,17,4,4,7,6,15,6,4,9,11,3,5,6,3,3,4,2,3,2,1
0,1,1,3,3,1,3,5,2,4,4,7,6,5,3,10,8,10,6,17,9,14,9,7,13,9,12,6,7,7,9,6,3,2,2,4,2,0,1,1

We want to: load that data into memory, calculate the average inflammation per day across all patients, and plot the result. To do all that, we’ll have to learn a little bit about programming.

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Diya Das
Katrin Leinweber
Rohit Goswami
Date Added:
03/20/2017
Project Organization and Management for Genomics
Unrestricted Use
CC BY
Rating
0.0 stars

Data Carpentry Genomics workshop lesson to learn how to structure your metadata, organize and document your genomics data and bioinformatics workflow, and access data on the NCBI sequence read archive (SRA) database.

Good data organization is the foundation of any research project. It not only sets you up well for an analysis, but it also makes it easier to come back to the project later and share with collaborators, including your most important collaborator - future you. Organizing a project that includes sequencing involves many components. There’s the experimental setup and conditions metadata, measurements of experimental parameters, sequencing preparation and sample information, the sequences themselves and the files and workflow of any bioinformatics analysis. So much of the information of a sequencing project is digital, and we need to keep track of our digital records in the same way we have a lab notebook and sample freezer.

In this lesson, we’ll go through the project organization and documentation that will make an efficient bioinformatics workflow possible. Not only will this make you a more effective bioinformatics researcher, it also prepares your data and project for publication, as grant agencies and publishers increasingly require this information.

In this lesson, we’ll be using data from a study of experimental evolution using E. coli. More information about this dataset is available here. In this study there are several types of files:
  • Spreadsheet data from the experiment that tracks the strains and their phenotype over time
  • Spreadsheet data with information on the samples that were sequenced - the names of the samples, how they were prepared and the sequencing conditions
  • The sequence data

Throughout the analysis, we’ll also generate files from the steps in the bioinformatics pipeline and documentation on the tools and parameters that we used.

In this lesson you will learn:
  • How to structure your metadata, tabular data and information about the experiment. The metadata is the information about the experiment and the samples you’re sequencing.
  • How to prepare for, understand, organize and store the sequencing data that comes back from the sequencing center
  • How to access and download publicly available data that may need to be used in your bioinformatics analysis
  • The concepts of organizing the files and documenting the workflow of your bioinformatics analysis

Subject:
Business and Communication
Genetics
Life Science
Management
Material Type:
Module
Provider:
The Carpentries
Author:
Amanda Charbonneau
Bérénice Batut
Daniel O. S. Ouso
Deborah Paul
Erin Alison Becker
François Michonneau
Jason Williams
Juan A. Ugalde
Kevin Weitemier
Laura Williams
Paula Andrea Martinez
Peter R. Hoyt
Rayna Michelle Harris
Taylor Reiter
Toby Hodges
Tracy Teal
Date Added:
08/07/2020
Puerto Rican Migration to the US
Unrestricted Use
CC BY
Rating
0.0 stars

This collection uses primary sources to explore Puerto Rican migration to the US. Digital Public Library of America Primary Source Sets are designed to help students develop their critical thinking skills and draw diverse material from libraries, archives, and museums across the United States. Each set includes an overview, ten to fifteen primary sources, links to related resources, and a teaching guide. These sets were created and reviewed by the teachers on the DPLA's Education Advisory Committee.

Subject:
Ethnic Studies
History
Social Science
U.S. History
Material Type:
Primary Source
Provider:
Digital Public Library of America
Provider Set:
Primary Source Sets
Author:
Samantha Gibson
Date Added:
04/11/2016
P values in display items are ubiquitous and almost invariably significant: A survey of top science journals
Unrestricted Use
CC BY
Rating
0.0 stars

P values represent a widely used, but pervasively misunderstood and fiercely contested method of scientific inference. Display items, such as figures and tables, often containing the main results, are an important source of P values. We conducted a survey comparing the overall use of P values and the occurrence of significant P values in display items of a sample of articles in the three top multidisciplinary journals (Nature, Science, PNAS) in 2017 and in 1997. We also examined the reporting of multiplicity corrections and its potential influence on the proportion of statistically significant P values. Our findings demonstrated substantial and growing reliance on P values in display items, with increases of 2.5 to 14.5 times in 2017 compared to 1997. The overwhelming majority of P values (94%, 95% confidence interval [CI] 92% to 96%) were statistically significant. Methods to adjust for multiplicity were almost non-existent in 1997, but reported in many articles relying on P values in 2017 (Nature 68%, Science 48%, PNAS 38%). In their absence, almost all reported P values were statistically significant (98%, 95% CI 96% to 99%). Conversely, when any multiplicity corrections were described, 88% (95% CI 82% to 93%) of reported P values were statistically significant. Use of Bayesian methods was scant (2.5%), and articles rarely (0.7%) relied exclusively on Bayesian statistics. Overall, wider appreciation of the need for multiplicity corrections is a welcome evolution, but the rapid growth of reliance on P values and implausibly high rates of reported statistical significance are worrisome.

Subject:
Mathematics
Statistics and Probability
Material Type:
Reading
Provider:
PLOS ONE
Author:
Ioana Alina Cristea
John P. A. Ioannidis
Date Added:
08/07/2020
Python Calculus
Read the Fine Print
Educational Use
Rating
0.0 stars

Students analyze a cartoon of a Rube Goldberg machine and a Python programming language script to practice engineering analysis. In both cases, they study the examples to determine how the different systems operate and the function of each component. This exercise in juxtaposition enables students to see the parallels between a more traditional mechanical engineering design and computer programming. Students also gain practice in analyzing two very different systems to fully understand how they work, similar to how engineers analyze systems and determine how they function and how changes to the system might affect the system.

Subject:
Applied Science
Computing and Information
Education
Engineering
Mathematics
Trigonometry
Material Type:
Lesson Plan
Provider:
TeachEngineering
Provider Set:
TeachEngineering
Author:
Brian Sandall
Scott Burns
Date Added:
09/18/2014
Python Programming for the Humanities -- A Python Course for the Humanities
Unrestricted Use
CC BY
Rating
0.0 stars

The programming language Python is widely used within many scientific domains nowadays and the language is readily accessible to scholars from the Humanities. Python is an excellent choice for dealing with (linguistic as well as literary) textual data, which is so typical of the Humanities. In this book you will be thoroughly introduced to the language and be taught to program basic algorithmic procedures. The book expects no prior experience with programming, although we hope to provide some interesting insights and skills for more advanced programmers as well. The book consists of 10 chapters. Chapter 5 and Chapter 6 are still in draft status and not ready for use.

Subject:
Arts and Humanities
Material Type:
Data Set
Full Course
Primary Source
Textbook
Provider:
DARIAH-DE
Author:
Folgert Karsdorp and Maarten van Gompel
modifications by Mike Kestemont and Lars Wieneke
Date Added:
01/29/2015
Python Script Analysis
Read the Fine Print
Educational Use
Rating
0.0 stars

Working in small groups, students complete and run functioning Python code. They begin by determining the missing commands in a sample piece of Python code that doubles all the elements of a given input and sums the resulting values. Then students modify more advanced Python code, which numerically computes the slope of a tangent line by finding the slopes of progressively closer secant lines; to this code they add explanatory comments to describe the function of each line of code. This requires students to understand the logic employed in the Python code. Finally, students make modifications to the code in order to find the slopes of tangents to a variety of functions.
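The two tasks described above could be sketched along these lines. This is not the activity's actual script; the function names, the step sizes, and the choice of x squared as the test function are assumptions made for illustration.

```python
def double_and_sum(values):
    """Double every element of the input and return the sum of the results."""
    return sum(2 * value for value in values)


def secant_slopes(f, x, steps=6):
    """Return slopes of secant lines through x and x + h as h shrinks."""
    slopes = []
    h = 1.0
    for _ in range(steps):
        # Slope of the secant line between (x, f(x)) and (x + h, f(x + h)).
        slopes.append((f(x + h) - f(x)) / h)
        h /= 10  # move the second point ten times closer to x
    return slopes


def square(x):
    return x ** 2


total = double_and_sum([1, 2, 3])
slopes = secant_slopes(square, 3.0)
```

For the square function at x = 3, the secant slopes should approach 6, the slope of the tangent line there. Swapping in a different function for square is the kind of modification the activity's final step describes.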

Subject:
Applied Science
Computing and Information
Education
Engineering
Mathematics
Trigonometry
Material Type:
Activity/Lab
Provider:
TeachEngineering
Provider Set:
TeachEngineering
Author:
Brian Sandall
Scott Burns
Date Added:
09/18/2014
Python for Harvesting Data on the Web
Conditional Remix & Share Permitted
CC BY-NC
Rating
0.0 stars

This session is an intermediate-to-advanced level class that offers some ideas for how to approach the following common data wrangling needs in research: 1) Obtain data and load it into a suitable data "container" for analysis, often via a web interface, especially an API, 2) parse the data retrieved via an API and turn it into a useful object for manipulation and analysis, and 3) perform some basic summary counts of records in a dataset and work up a quick visualization.
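Steps 2 and 3 above can be illustrated with a short sketch that parses a JSON payload and produces summary counts. The payload is a made-up stand-in for an API response, with hypothetical field names; step 1 (the web request itself) and the visualization are left out.

```python
import json
from collections import Counter

# Hypothetical JSON text standing in for the body of an API response.
response_text = """
{"records": [
    {"title": "First item", "year": 2018, "type": "article"},
    {"title": "Second item", "year": 2019, "type": "dataset"},
    {"title": "Third item", "year": 2019, "type": "article"}
]}
"""

# json.loads turns the JSON text into nested Python dicts and lists.
data = json.loads(response_text)
records = data["records"]

# Counter tallies how many records share each value.
type_counts = Counter(record["type"] for record in records)
year_counts = Counter(record["year"] for record in records)
```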

Subject:
Applied Science
Life Science
Physical Science
Social Science
Material Type:
Activity/Lab
Provider:
New York University
Author:
Nick Wolf
Vicky Steeves
Date Added:
01/06/2020
Python for Humanities
Unrestricted Use
CC BY
Rating
0.0 stars

Python is a general purpose programming language that is useful for writing scripts to work effectively and reproducibly with data. This is an introduction to Python designed for participants with no programming experience. These lessons can be taught in a day (~ 6 hours). They start with some basic information about Python syntax, the Jupyter notebook interface, and move through how to import CSV files, using the pandas package to work with data frames, how to calculate summary information from a data frame, and a brief introduction to plotting. The last lesson demonstrates how to work with databases directly from Python.
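As one hedged illustration of the last topic mentioned, working with a database directly from Python, the sketch below uses the standard sqlite3 module with an in-memory database. The table name, columns, and rows are invented for this example, and the lessons may use a different database or library.

```python
import sqlite3

# An in-memory database exists only while the connection is open.
conn = sqlite3.connect(":memory:")

# Hypothetical table of works with a title and a publication year.
conn.execute("CREATE TABLE works (title TEXT, year INTEGER)")
conn.executemany(
    "INSERT INTO works VALUES (?, ?)",
    [("Work A", 1850), ("Work B", 1850), ("Work C", 1900)],
)

# Summary query: number of works per year, in chronological order.
counts_by_year = conn.execute(
    "SELECT year, COUNT(*) FROM works GROUP BY year ORDER BY year"
).fetchall()

conn.close()
```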

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Iain Emsley
Date Added:
08/07/2020