429 Results

Selected filters:
  • data
Data Intro for Archivists
Unrestricted Use
CC BY
Rating
0.0 stars

This Library Carpentry lesson introduces archivists to working with data. At the conclusion of the lesson you will: be able to explain terms, phrases, and concepts in code or software development; identify and use best practice in data structures; use regular expressions in searches.

Subject:
Applied Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
James Baker
Jeanine Finn
Jenny Bunn
Katherine Koziar
Noah Geraci
Scott Peterson
Date Added:
08/07/2020
Data Is Present: Open Workshops and Hackathons
Unrestricted Use
CC BY
Rating
0.0 stars

Original data has become more accessible thanks to cultural and technological advances. On the internet, we can find innumerable data sets from sources such as scientific journals and repositories, local and national governments, and non-governmental organisations. Often, these data may be presented in novel ways, by creating new tables or plots, or by integrating additional data. Free, open-source software has become a great companion for open data. This open scholarship project offers free workshops and coding meet-ups (hackathons) to learn and practise data presentation, across the UK. It is made possible by a fellowship of the Software Sustainability Institute.

Subject:
Applied Science
Life Science
Physical Science
Social Science
Material Type:
Activity/Lab
Author:
Pablo Bernabeu
Date Added:
01/27/2020
The Data Journalism Handbook
Conditional Remix & Share Permitted
CC BY-SA
Rating
0.0 stars

When you combine the sheer scale and range of digital information now available with a journalist’s "nose for news" and her ability to tell a compelling story, a new world of possibility opens up. With The Data Journalism Handbook, you’ll explore the potential, limits, and applied uses of this new and fascinating field.

This valuable handbook has attracted scores of contributors since the European Journalism Centre and the Open Knowledge Foundation launched the project at MozFest 2011. Through a collection of tips and techniques from leading journalists, professors, software developers, and data analysts, you’ll learn how data can be either the source of data journalism or a tool with which the story is told—or both.

Subject:
Business and Communication
Journalism
Material Type:
Textbook
Provider:
University of Bath
Author:
Jonathan Gray
Liliana Bounegru
Lucy Chambers
Date Added:
07/02/2019
Data Management Planning
Unrestricted Use
Public Domain
Rating
0.0 stars

Data management planning is the starting point in the data life cycle. Creating a formal document that outlines what you will do with the data during and after the completion of research helps to ensure that the data is safe for current and future use. This lesson describes the benefits of a data management plan (DMP), outlines the components of a DMP, details tools for creating a DMP, provides NSF DMP information, and demonstrates the use of an example DMP.

Subject:
Applied Science
Education
Higher Education
Information Science
Material Type:
Lesson
Provider:
DataONE
Author:
DataONE Community Engagement & Outreach Working Group
Date Added:
11/21/2020
Data Management Short Course for Scientists
Read the Fine Print
Educational Use
Rating
0.0 stars

The ESIP Federation, in cooperation with NOAA and the Data Conservancy, seeks to share the community's knowledge with scientists who increasingly need to be better data managers, as well as to support workforce development for new data management professionals. Over the next several years, the ESIP Federation expects to evolve training courses that seek to improve the understanding of scientific data management among scientists, emerging scientists, and data professionals of all sorts.

All courses are available under a Creative Commons Attribution 3.0 license that allows you to share and adapt the work as long as you cite the work according to the citation provided. Please send feedback on the courses to shortcourseeditors@esipfed.org.

Subject:
Applied Science
Information Science
Material Type:
Lecture
Module
Primary Source
Author:
Earth Science Information Partners
Date Added:
03/21/2022
Data Management Skillbuilding Hub - DataOne
Unrestricted Use
Public Domain
Rating
0.0 stars

The Data Management Skillbuilding Hub is a repository for open educational resources regarding data management, meaning that it is a collection of learning resources freely contributed by anyone willing to share them. Materials such as lessons, best practices, and videos are stored in the DataONEorg GitHub repository and are searchable through the Data Management Training Clearinghouse. We invite you to submit your own educational resources so that the Data Management Skillbuilding Hub can remain an up-to-date and sustainable educational tool for all to benefit from. You can easily contribute learning materials to the Skillbuilding Hub via GitHub online.

Subject:
Applied Science
Information Science
Material Type:
Lesson
Primary Source
Provider:
DataONE
Date Added:
03/21/2022
Data Management with SQL for Ecologists
Unrestricted Use
CC BY
Rating
0.0 stars

Databases are useful for both storing and using data effectively. Using a relational database serves several purposes. It keeps your data separate from your analysis. This means there’s no risk of accidentally changing data when you analyze it. If we get new data, we can rerun a query to find all the data that meets certain criteria. It’s fast, even for large amounts of data. It improves quality control of data entry (type constraints and use of forms in Access, Filemaker, etc.). The concepts of relational database querying are core to understanding how to do similar things using programming languages such as R or Python. This lesson will teach you what relational databases are, how you can load data into them, and how you can query databases to extract just the information that you need.
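The points in the description above (rerunning a saved query when new data arrives, and using constraints for quality control at data entry) can be illustrated with a minimal sketch using Python's built-in sqlite3 module. This is not taken from the lesson itself: the table name `surveys`, its columns, the sample rows, and the helper `species_over_weight` are all hypothetical.

```python
import sqlite3

# Hypothetical query: the table and column names are illustrative only.
QUERY = "SELECT species, weight FROM surveys WHERE weight > ? ORDER BY weight"


def species_over_weight(conn, min_weight):
    """Return (species, weight) rows heavier than min_weight, lightest first."""
    return conn.execute(QUERY, (min_weight,)).fetchall()


# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")

# The CHECK constraint is a simple form of data-entry quality control.
conn.execute(
    "CREATE TABLE surveys (species TEXT NOT NULL, weight REAL CHECK (weight > 0))"
)
conn.executemany(
    "INSERT INTO surveys VALUES (?, ?)",
    [("DM", 40.0), ("PF", 8.0)],  # made-up sample records
)

# First run of the query.
first = species_over_weight(conn, 10)

# New data arrives; the same query is rerun unchanged.
conn.execute("INSERT INTO surveys VALUES (?, ?)", ("NL", 150.0))
second = species_over_weight(conn, 10)

# An invalid record is rejected by the database instead of silently stored.
try:
    conn.execute("INSERT INTO surveys VALUES (?, ?)", ("DM", -5.0))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(first, second, rejected)
```

Because the query only reads from the table, the stored records cannot be altered by the analysis step, which is the separation of data and analysis that the description mentions.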

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Christina Koch
Donal Heidenblad
Katy Felkner
Rémi Rampin
Timothée Poisot
Date Added:
03/20/2017
Data Management with SQL for Social Scientists
Unrestricted Use
CC BY
Rating
0.0 stars

This is an alpha lesson to teach Data Management with SQL for Social Scientists. We welcome any criticism or error reports, and will take your feedback into account to improve both the presentation and the content. Databases are useful for both storing and using data effectively. Using a relational database serves several purposes. It keeps your data separate from your analysis. This means there’s no risk of accidentally changing data when you analyze it. If we get new data, we can rerun a query to find all the data that meets certain criteria. It’s fast, even for large amounts of data. It improves quality control of data entry (type constraints and use of forms in Access, Filemaker, etc.). The concepts of relational database querying are core to understanding how to do similar things using programming languages such as R or Python. This lesson will teach you what relational databases are, how you can load data into them, and how you can query databases to extract just the information that you need.

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Social Science
Material Type:
Module
Provider:
The Carpentries
Author:
Peter Smyth
Date Added:
08/07/2020
Data Organization in Spreadsheets for Ecologists
Unrestricted Use
CC BY
Rating
0.0 stars

Good data organization is the foundation of any research project. Most researchers have data in spreadsheets, so it’s the place that many research projects start. We organize data in spreadsheets in the ways that we as humans want to work with the data, but computers require that data be organized in particular ways. In order to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way that computers need the data. Since this is where most research projects start, this is where we want to start too! In this lesson, you will learn: good data entry practices (formatting data tables in spreadsheets); how to avoid common formatting mistakes; approaches for handling dates in spreadsheets; basic quality control and data manipulation in spreadsheets; and exporting data from spreadsheets. In this lesson, however, you will not learn about data analysis with spreadsheets. Much of your time as a researcher will be spent in the initial ‘data wrangling’ stage, where you need to organize the data to perform a proper analysis later. It’s not the most fun, but it is necessary. In this lesson you will learn how to think about data organization and some practices for more effective data wrangling. With this approach you can better format current data and plan new data collection so less data wrangling is needed.

Subject:
Applied Science
Computer Science
Information Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Christie Bahlai
Peter R. Hoyt
Tracy Teal
Date Added:
03/20/2017
Data Organization in Spreadsheets for Social Scientists
Unrestricted Use
CC BY
Rating
0.0 stars

Lesson on spreadsheets for social scientists. Good data organization is the foundation of any research project. Most researchers have data in spreadsheets, so it’s the place that many research projects start. Typically we organize data in spreadsheets in ways that we as humans want to work with the data. However, computers require data to be organized in particular ways. In order to use tools that make computation more efficient, such as programming languages like R or Python, we need to structure our data the way that computers need the data. Since this is where most research projects start, this is where we want to start too! In this lesson, you will learn: good data entry practices (formatting data tables in spreadsheets); how to avoid common formatting mistakes; approaches for handling dates in spreadsheets; basic quality control and data manipulation in spreadsheets; and exporting data from spreadsheets. In this lesson, however, you will not learn about data analysis with spreadsheets. Much of your time as a researcher will be spent in the initial ‘data wrangling’ stage, where you need to organize the data to perform a proper analysis later. It’s not the most fun, but it is necessary. In this lesson you will learn how to think about data organization and some practices for more effective data wrangling. With this approach you can better format current data and plan new data collection so less data wrangling is needed.

Subject:
Applied Science
Information Science
Mathematics
Measurement and Data
Social Science
Material Type:
Module
Provider:
The Carpentries
Author:
David Mawdsley
Erin Becker
François Michonneau
Karen Word
Lachlan Deer
Peter Smyth
Date Added:
08/07/2020
Data Quality Control and Assurance
Unrestricted Use
Public Domain
Rating
0.0 stars

Quality assurance and quality control are phrases used to describe activities that prevent errors from entering or staying in a data set. These activities ensure the quality of the data before it is collected, entered, or analyzed, and actively monitor and maintain the quality of data throughout the study. In this lesson, we define and provide examples of quality assurance, quality control, data contamination, and types of errors that may be found in data sets. After completing this lesson, participants will be able to describe best practices in quality assurance and quality control and relate them to different phases of data collection and entry.

Subject:
Applied Science
Education
Higher Education
Information Science
Material Type:
Lesson
Provider:
DataONE
Author:
DataONE Community Engagement & Outreach Working Group
Date Added:
11/21/2020
Data Sharing
Unrestricted Use
Public Domain
Rating
0.0 stars

When first sharing research data, researchers often raise questions about the value, benefits, and mechanisms for sharing. Many stakeholders and interested parties, such as funding agencies, communities, other researchers, or members of the public may be interested in research, results and related data. This lesson addresses data sharing in the context of the data life cycle, the value of sharing data, concerns about sharing data, and methods and best practices for sharing data.

Subject:
Applied Science
Education
Higher Education
Information Science
Material Type:
Lesson
Provider:
DataONE
Author:
DataONE Community Engagement & Outreach Working Group
Date Added:
11/21/2020
Data Sharing, Mandates, and Repositories
Conditional Remix & Share Permitted
CC BY-NC
Rating
0.0 stars

Some research funders have a mandate for data resulting from their funded research to be shared. This presentation provides a general definition of data sharing and how scholars can identify and follow data sharing mandates.

Subject:
Applied Science
Education
Higher Education
Information Science
Material Type:
Lecture
Author:
Kristy Padron
Date Added:
11/22/2020
Data Sharing by Scientists: Practices and Perceptions
Unrestricted Use
CC BY
Rating
0.0 stars

Background: Scientific research in the 21st century is more data intensive and collaborative than in the past. It is important to study the data practices of researchers – data accessibility, discovery, re-use, preservation and, particularly, data sharing. Data sharing is a valuable part of the scientific method allowing for verification of results and extending research from prior results. Methodology/Principal Findings: A total of 1329 scientists participated in this survey exploring current data sharing practices and perceptions of the barriers and enablers of data sharing. Scientists do not make their data electronically available to others for various reasons, including insufficient time and lack of funding. Most respondents are satisfied with their current processes for the initial and short-term parts of the data or research lifecycle (collecting their research data; searching for, describing or cataloging, analyzing, and short-term storage of their data) but are not satisfied with long-term data preservation. Many organizations do not provide support to their researchers for data management both in the short- and long-term. If certain conditions are met (such as formal citation and sharing reprints) respondents agree they are willing to share their data. There are also significant differences and approaches in data management practices based on primary funding agency, subject discipline, age, work focus, and world region. Conclusions/Significance: Barriers to effective data sharing and preservation are deeply rooted in the practices and culture of the research process as well as the researchers themselves. New mandates for data management plans from NSF and other federal agencies and world-wide attention to the need to share and preserve data could lead to changes. Large scale programs, such as the NSF-sponsored DataNET (including projects like DataONE) will both bring attention and resources to the issue and make it easier for scientists to apply sound data management principles.

Subject:
Ecology
Life Science
Social Science
Material Type:
Reading
Provider:
PLOS ONE
Author:
Arsev Umur Aydinoglu
Carol Tenopir
Eleanor Read
Kimberly Douglass
Lei Wu
Maribeth Manoff
Mike Frame
Suzie Allard
Date Added:
08/07/2020
Data Structures and Algorithms Materials
Unrestricted Use
CC BY
Rating
0.0 stars

Assignments, notes, and exam questions for CS 315: Data Structures and Algorithms. Taught by Raphael Finkel, Department of Computer Science, University of Kentucky.

Subject:
Applied Science
Computer Science
Material Type:
Assessment
Homework/Assignment
Lecture Notes
Provider:
University of Kentucky
Author:
Raphael Finkel
Date Added:
07/16/2024
Data Training Engaging End-users
Conditional Remix & Share Permitted
CC BY-SA
Rating
0.0 stars

Data Tree is a free online course with all you need to know for research data management, along with ways to engage and share data with business, policymakers, media and the wider public. The self-paced training course will take 15 to 20 hours to complete in eight structured modules. The course is packed with video, quizzes and real-life examples of data management, along with valuable tips from experts in data management, data sharing and science communication. The training course materials will be available for structured learning, but also to dip into for immediate problem solving.

Data Tree is funded by the Natural Environment Research Council (NERC) through the National Productivity Investment Fund (NPIF), delivered by the Institute for Environmental Analytics and Stats4SD and supported by the Institute of Physics.

Subject:
Applied Science
Information Science
Material Type:
Module
Primary Source
Date Added:
05/16/2022
Data Wrangling and Processing for Genomics
Unrestricted Use
CC BY
Rating
0.0 stars

Data Carpentry lesson to learn how to use command-line tools to perform quality control, align reads to a reference genome, and identify and visualize between-sample variation. A lot of genomics analysis is done using command-line tools for three reasons: 1) you will often be working with a large number of files, and working through the command-line rather than through a graphical user interface (GUI) allows you to automate repetitive tasks, 2) you will often need more compute power than is available on your personal computer, and connecting to and interacting with remote computers requires a command-line interface, and 3) you will often need to customize your analyses, and command-line tools often enable more customization than the corresponding GUI tools (if in fact a GUI tool even exists). In a previous lesson, you learned how to use the bash shell to interact with your computer through a command line interface. In this lesson, you will be applying this new knowledge to carry out a common genomics workflow - identifying variants among sequencing samples taken from multiple individuals within a population. We will be starting with a set of sequenced reads (.fastq files), performing some quality control steps, aligning those reads to a reference genome, and ending by identifying and visualizing variations among these samples. As you progress through this lesson, keep in mind that, even if you aren’t going to be doing this same workflow in your research, you will be learning some very important lessons about using command-line bioinformatic tools. What you learn here will enable you to use a variety of bioinformatic tools with confidence and greatly enhance your research efficiency and productivity.

Subject:
Applied Science
Computer Science
Genetics
Information Science
Life Science
Mathematics
Measurement and Data
Material Type:
Module
Provider:
The Carpentries
Author:
Adam Thomas
Ahmed R. Hasan
Aniello Infante
Anita Schürch
Dev Paudel
Erin Alison Becker
Fotis Psomopoulos
François Michonneau
Gaius Augustus
Gregg TeHennepe
Jason Williams
Jessica Elizabeth Mizzi
Karen Cranston
Kari L Jordan
Kate Crosby
Kevin Weitemier
Lex Nederbragt
Luis Avila
Peter R. Hoyt
Rayna Michelle Harris
Ryan Peek
Sheldon John McKay
Sheldon McKay
Taylor Reiter
Tessa Pierce
Toby Hodges
Tracy Teal
Vasilis Lenis
Winni Kretzschmar
dbmarchant
Date Added:
08/07/2020