OER Commons

Unrestricted Use

CC BY

Análisis y visualización de datos usando Python

Python is a general-purpose programming language that is useful for writing scripts to work with data effectively and reproducibly. This is an introduction to Python designed for participants with no programming experience. These lessons can be taught in one day (~6 hours). The lessons start with basic information about Python syntax and the Jupyter Notebook interface, and continue with how to import CSV files, how to use the Pandas package to work with DataFrames, how to calculate summary information for a DataFrame, and a brief introduction to creating visualizations. The last lesson demonstrates how to work with databases directly from Python. Note: the data have not been translated from the original English version, so variable names remain in English and the numbers in each observation use English-language conventions (comma as thousands separator and period as decimal separator).

Subject:: Applied Science; Computer Science; Information Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: Alejandra Gonzalez-Beltran; April Wright; Christopher Erdmann; Enric Escorsa O'Callaghan; Erin Becker; Fernando Garcia; Hely Salgado; Juan M. Barrios; Juan Martín Barrios; Katrin Leinweber; LUS24; Laura Angelone; Leonardo Ulises Spairani; Maxim Belkin; Miguel González; Nicolás Palopoli; Nohemi Huanca Nunez; Paula Andrea Martinez; Raniere Silva; Rayna Harris; Sarah Brown; Silvana Pereyra; Spencer Harris; Stephan Druskat; Trevor Keller; Wilson Lozano; chekos; monialo2000; rzayas
Date Added:: 08/07/2020

Unrestricted Use

CC BY

Carpentries Instructor Training

A two-day introduction to modern evidence-based teaching practices, built and maintained by the Carpentry community.

Subject:: Applied Science; Computer Science; Education; Higher Education; Information Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: Aleksandra Nenadic; Alexander Konovalov; Alistair John Walsh; Allison Weber; Amy E. Hodge; Andrew B. Collier; Anita Schürch; AnnaWilliford; Ariel Rokem; Brian Ballsun-Stanton; Callin Switzer; Christian Brueffer; Christina Koch; Christopher Erdmann; Colin Morris; Dan Allan; DanielBrett; Danielle Quinn; Darya Vanichkina; David Jennings; Eric Jankowski; Erin Alison Becker; Evan Peter Williamson; François Michonneau; Gerard Capes; Greg Wilson; Ian Lee; Jason M Gates; Jason Williams; Jeffrey Oliver; Joe Atzberger; John Bradley; John Pellman; Jonah Duckles; Jonathan Bradley; Karen Cranston; Karen Word; Kari L Jordan; Katherine Koziar; Katrin Leinweber; Kees den Heijer; Laurence; Lex Nederbragt; Maneesha Sane; Marie-Helene Burle; Mik Black; Mike Henry; Murray Cadzow; Neal Davis; Neil Kindlon; Nicholas Tierney; Nicolás Palopoli; Noah Spies; Paula Andrea Martinez; Petraea; Rayna Michelle Harris; Rémi Emonet; Rémi Rampin; Sarah Brown; Sarah M Brown; Sarah Stevens; Sean; Serah Anne Njambi Kiburu; Stefan Helfrich; Steve Moss; Stéphane Guillou; Ted Laderas; Tiago M. D. Pereira; Toby Hodges; Tracy Teal; Yo Yehudi; amoskane; davidbenncsiro; naught101; satya-vinay
Date Added:: 08/07/2020

Unrestricted Use

CC BY

Databases and SQL

Software Carpentry lesson that teaches how to use databases and SQL. In the late 1920s and early 1930s, William Dyer, Frank Pabodie, and Valentina Roerich led expeditions to the Pole of Inaccessibility in the South Pacific, and then onward to Antarctica. Two years ago, the records of their expeditions were found in a storage locker at Miskatonic University. We have scanned and OCRed the data they contain, and we now want to store that information in a way that will make search and analysis easy. Three common options for storage are text files, spreadsheets, and databases. Text files are the easiest to create and work well with version control, but we would then have to build search and analysis tools ourselves. Spreadsheets are good for simple analyses, but they don’t handle large or complex data sets well. Databases, however, include powerful tools for search and analysis, and can handle large, complex data sets. These lessons will show how to use a database to explore the expeditions’ data.

Subject:: Applied Science; Computer Science; Information Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: Amy Brown; Andrew Boughton; Andrew Kubiak; Avishek Kumar; Ben Waugh; Bill Mills; Brian Ballsun-Stanton; Chris Tomlinson; Colleen Fallaw; Dan Michael Heggø; Daniel Suess; Dave Welch; David W Wright; Deborah Gertrude Digges; Donny Winston; Doug Latornell; Erin Alison Becker; Ethan Nelson; Ethan P White; François Michonneau; George Graham; Gerard Capes; Gideon Juve; Greg Wilson; Ioan Vancea; Jake Lever; James Mickley; John Blischak; JohnRMoreau@gmail.com; Jonah Duckles; Jonathan Guyer; Joshua Nahum; Kate Hertweck; Kevin Dyke; Louis Vernon; Luc Small; Luke William Johnston; Maneesha Sane; Mark Stacy; Matthew Collins; Matty Jones; Mike Jackson; Morgan Taschuk; Patrick McCann; Paula Andrea Martinez; Pauline Barmby; Piotr Banaszkiewicz; Raniere Silva; Ray Bell; Rayna Michelle Harris; Rémi Emonet; Rémi Rampin; Seda Arat; Sheldon John McKay; Sheldon McKay; Stephen Davison; Thomas Guignard; Trevor Bekolay; lorra; slimlime
Date Added:: 03/20/2017

Unrestricted Use

CC BY

El Control de Versiones con Git

Software Carpentry lesson on version control with Git. To illustrate the power of Git and GitHub, we will use the following story as a motivating example throughout this lesson. Wolfman and Dracula have been hired by Universal Missions to investigate whether it is possible to send their next planetary lander to Mars. They want to be able to work on the plans at the same time, but they have run into problems doing this in the past. If they take turns, each one will spend a lot of time waiting for the other to finish, but if they work on their own copies and email changes back and forth, things will be lost, overwritten, or duplicated. A colleague suggests using version control to manage their work. Version control is better than mailing files back and forth: nothing that is placed under version control is ever lost, unless substantial effort is made to remove it. Since all previous versions of files are saved, it is always possible to go back in time and see exactly who wrote what on a particular day, or which version of a program was used to generate a particular set of results. Because there is a record of who made which changes and when, you know whom to ask if you have questions later on and, if needed, you can revert to a previous version, much like the “undo” command in a text editor. When several people collaborate on the same project, it is possible to accidentally overlook or overwrite someone else’s changes. The version control system automatically notifies users whenever there is a conflict between one person’s work and another’s. Teams are not the only ones who benefit from version control: independent researchers can benefit greatly as well. Keeping a record of what changed, when, and why is extremely useful for all researchers if they ever need to come back to the project later on (e.g., a year later, when the memory of the details has faded).

Subject:: Applied Science; Computer Science; Information Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: Alejandra Gonzalez-Beltran; Amy Olex; Belinda Weaver; Bradford Condon; Casey Youngflesh; Daisie Huang; Dani Ledezma; Francisco Palm; Garrett Bachant; Heather Nunn; Hely Salgado; Ian Lee; Ivan Gonzalez; James E McClure; Javier Forment; Jimmy O'Donnell; Jonah Duckles; K.E. Koziar; Katherine Koziar; Katrin Leinweber; Kevin Alquicira; Kevin MF; Kurt Glaesemann; LauCIFASIS; Leticia Vega; Lex Nederbragt; Mark Woodbridge; Matias Andina; Matt Critchlow; Mingsheng Zhang; Nelly Sélem; Nima Hejazi; Nohemi Huanca Nunez; Olemis Lang; P. L. Lim; Paula Andrea Martinez; Peace Ossom Williamson; Rayna M Harris; Romualdo Zayas-Lagunas; Sarah Stevens; Saskia Hiltemann; Shirley Alquicira; Silvana Pereyra; Tom Morrell; Valentina Bonetti; Veronica Ikeshoji-Orlati; Veronica Jimenez; butterflyskip; dounia
Date Added:: 08/07/2020

Unrestricted Use

CC BY

Genomics Workshop Overview

Workshop overview for the Data Carpentry genomics curriculum. Data Carpentry’s aim is to teach researchers basic concepts, skills, and tools for working with data so that they can get more done in less time, and with less pain. This workshop teaches data management and analysis for genomics research including: best practices for organization of bioinformatics projects and data, use of command-line utilities, use of command-line tools to analyze sequence quality and perform variant calling, and connecting to and using cloud computing. This workshop is designed to be taught over two full days of instruction. Please note that workshop materials for working with Genomics data in R are in “alpha” development. These lessons are available for review and for informal teaching experiences, but are not yet part of The Carpentries’ official lesson offerings. Interested in teaching these materials? We have an onboarding video and accompanying slides available to prepare Instructors to teach these lessons. After watching this video, please contact team@carpentries.org so that we can record your status as an onboarded Instructor. Instructors who have completed onboarding will be given priority status for teaching at centrally-organized Data Carpentry Genomics workshops.

Subject:: Applied Science; Computer Science; Genetics; Information Science; Life Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: Amanda Charbonneau; Erin Alison Becker; François Michonneau; Jason Williams; Maneesha Sane; Matthew Kweskin; Muhammad Zohaib Anwar; Murray Cadzow; Paula Andrea Martinez; Taylor Reiter; Tracy Teal
Date Added:: 08/07/2020

Unrestricted Use

CC BY

Introduction to R for Geospatial Data

The goal of this lesson is to provide an introduction to R for learners working with geospatial data. It is intended as a prerequisite for the R for Raster and Vector Data lesson for learners who have no prior experience using R. This lesson can be taught in approximately 4 hours and covers the following topics: working with R in the RStudio GUI; project management and file organization; importing data into R; an introduction to R’s core data types and data structures; manipulation of data frames (tabular data) in R; an introduction to visualization; and writing data to a file. The R for Raster and Vector Data lesson provides a more in-depth introduction to visualization (focusing on geospatial data) and to working with data structures unique to geospatial data.

Subject:: Applied Science; Computer Science; Information Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: Anne Fouilloux; Chris Prener; Claudia Engel; David Mawdsley; Erin Becker; François Michonneau; Ido Bar; Jeffrey Oliver; Juan Fung; Katrin Leinweber; Kevin Weitemier; Kok Ben Toh; Lachlan Deer; Marieke Frassl; Matt Clark; Miles McBain; Naupaka Zimmerman; Paula Andrea Martinez; Preethy Nair; Raniere Silva; Rayna Harris; Richard McCosh; Vicken Hillis; butterflyskip
Date Added:: 08/07/2020

Unrestricted Use

CC BY

La Terminal de Unix

Software Carpentry lesson on the Unix shell. The Unix shell has been around longer than most of its users. It has survived so long because it is a powerful tool that allows people to do complex things with just a few keystrokes. More importantly, it helps users combine existing programs in new ways and automate repetitive tasks, instead of typing the same things over and over again. Use of the terminal, or shell, is fundamental to using many other powerful tools and computing resources (including supercomputers, or “high-performance computing”). This lesson will guide you on the path towards using these resources effectively.

Subject:: Applied Science; Computer Science; Information Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: Adam Huffman; Alejandra Gonzalez-Beltran; AnaBVA; Andrew Sanchez; Anja Le Blanc; Ashwin Srinath; Brian Ballsun-Stanton; Colin Morris; Dani Ledezma; Dave Bridges; Erin Becker; Francisco Palm; François Michonneau; Gabriel A. Devenyi; Gerard Capes; Giuseppe Profiti; Gordon Rhea; Jake Cowper Szamosi; Jared Flater; Jeff Oliver; Jonah Duckles; Juan M. Barrios; Katrin Leinweber; Kelly L. Rowland; Kevin Alquicira; Kunal Marwaha; LauCIFASIS; Marisa Lim; Martha Robinson; Matias Andina; Michael Zingale; Nicolas Barral; Nohemi Huanca Nunez; Olemis Lang; Otoniel Maya; Paula Andrea Martinez; Raniere Silva; Rayna M Harris; Shirley Alquicira; Silvana Pereyra; Steve Leak; Stéphane Guillou; Thomas Mellan; Veronica Jimenez-Jacinto; William L. Close; Yee Mey; csqrs; sjnair
Date Added:: 08/07/2020

Unrestricted Use

CC BY

Project Organization and Management for Genomics

Data Carpentry Genomics workshop lesson to learn how to structure your metadata, organize and document your genomics data and bioinformatics workflow, and access data on the NCBI Sequence Read Archive (SRA) database. Good data organization is the foundation of any research project. It not only sets you up well for an analysis, but it also makes it easier to come back to the project later and share with collaborators, including your most important collaborator - future you. Organizing a project that includes sequencing involves many components. There’s the experimental setup and conditions metadata, measurements of experimental parameters, sequencing preparation and sample information, the sequences themselves, and the files and workflow of any bioinformatics analysis. So much of the information of a sequencing project is digital, and we need to keep track of our digital records in the same way we have a lab notebook and sample freezer. In this lesson, we’ll go through the project organization and documentation that will make an efficient bioinformatics workflow possible. Not only will this make you a more effective bioinformatics researcher, it also prepares your data and project for publication, as grant agencies and publishers increasingly require this information. In this lesson, we’ll be using data from a study of experimental evolution using E. coli. More information about this dataset is available here. In this study there are several types of files: spreadsheet data from the experiment that tracks the strains and their phenotype over time; spreadsheet data with information on the samples that were sequenced (the names of the samples, how they were prepared, and the sequencing conditions); and the sequence data. Throughout the analysis, we’ll also generate files from the steps in the bioinformatics pipeline and documentation on the tools and parameters that we used. In this lesson you will learn: how to structure your metadata, tabular data, and information about the experiment (the metadata is the information about the experiment and the samples you’re sequencing); how to prepare for, understand, organize, and store the sequencing data that comes back from the sequencing center; how to access and download publicly available data that may need to be used in your bioinformatics analysis; and the concepts of organizing the files and documenting the workflow of your bioinformatics analysis.

Subject:: Business and Communication; Genetics; Life Science; Management
Material Type:: Module
Provider:: The Carpentries
Author:: Amanda Charbonneau; Bérénice Batut; Daniel O. S. Ouso; Deborah Paul; Erin Alison Becker; François Michonneau; Jason Williams; Juan A. Ugalde; Kevin Weitemier; Laura Williams; Paula Andrea Martinez; Peter R. Hoyt; Rayna Michelle Harris; Taylor Reiter; Toby Hodges; Tracy Teal
Date Added:: 08/07/2020

Unrestricted Use

CC BY

R for Reproducible Scientific Analysis

This lesson is part of the Software Carpentry workshop. It is an introduction to R for non-programmers using Gapminder data. The goal of this lesson is to teach novice programmers to write modular code and to follow best practices for using R for data analysis. R is commonly used in many scientific disciplines for statistical analysis and for its array of third-party packages. We find that many scientists who come to Software Carpentry workshops use R and want to learn more. The emphasis of these materials is to give attendees a strong foundation in the fundamentals of R, and to teach best practices for scientific computing: breaking down analyses into modular units, task automation, and encapsulation. Note that this workshop will focus on teaching the fundamentals of the programming language R, and will not teach statistical analysis. The lesson contains more material than can be taught in a day. The instructor notes page has some suggested lesson plans suitable for a one-day or half-day workshop. A variety of third-party packages are used throughout this workshop. These are not necessarily the best, nor are they comprehensive, but they are packages we find useful, and have been chosen primarily for their usability.

Subject:: Applied Science; Computer Science; Information Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: Adam H. Sparks; Ahsan Ali Khoja; Amy Lee; Ana Costa Conrado; Andrew Boughton; Andrew Lonsdale; Andrew MacDonald; Andris Jankevics; Andy Teucher; Antonio Berlanga-Taylor; Ashwin Srinath; Ben Bolker; Bill Mills; Bret Beheim; Clare Sloggett; Daniel; Dave Bridges; David J. Harris; David Mawdsley; Dean Attali; Diego Rabatone Oliveira; Drew Tyre; Elise Morrison; Erin Alison Becker; Fernando Mayer; François Michonneau; Giulio Valentino Dalla Riva; Gordon McDonald; Greg Wilson; Harriet Dashnow; Ido Bar; Jaime Ashander; James Balamuta; James Mickley; Jamie McDevitt-Irwin; Jeffrey Arnold; Jeffrey Oliver; John Blischak; Jonah Duckles; Josh Quan; Julia Piaskowski; Kara Woo; Kate Hertweck; Katherine Koziar; Katrin Leinweber; Kellie Ottoboni; Kevin Weitemier; Kiana Ashley West; Kieran Samuk; Kunal Marwaha; Kyriakos Chatzidimitriou; Lachlan Deer; Lex Nederbragt; Liz Ing-Simmons; Lucy Chang; Luke W Johnston; Luke Zappia; Marc Sze; Marie-Helene Burle; Marieke Frassl; Mark Dunning; Martin John Hadley; Mary Donovan; Matt Clark; Melissa Kardish; Mike Jackson; Murray Cadzow; Narayanan Raghupathy; Naupaka Zimmerman; Nelly Sélem; Nicholas Lesniak; Nicholas Potter; Nima Hejazi; Nora Mitchell; Olivia Rata Burge; Paula Andrea Martinez; Pete Bachant; Phil Bouchet; Philipp Boersch-Supan; Piotr Banaszkiewicz; Raniere Silva; Rayna Michelle Harris; Remi Daigle; Research Bazaar; Richard Barnes; Robert Bagchi; Rémi Emonet; Sam Penrose; Sandra Brosda; Sarah Munro; Sasha Lavrentovich; Scott Allen Funkhouser; Scott Ritchie; Sebastien Renaut; Thea Van Rossum; Timothy Eoin Moore; Timothy Rice; Tobin Magle; Trevor Bekolay; Tyler Crawford Kelly; Vicken Hillis; Yuka Takemon; bippuspm; butterflyskip; waiteb5
Date Added:: 03/20/2017

Unrestricted Use

CC BY

R para Análisis Científicos Reproducibles

An introduction to R using the Gapminder data. The goal of this lesson is to teach novice programmers to write modular code and adopt good practices for using R for data analysis. R provides a set of third-party packages that are commonly used across many scientific disciplines for statistical analysis. We find that many scientists who attend Software Carpentry workshops use R and want to learn more. Our materials are relevant because they give attendees a strong foundation in the fundamentals of R and teach best practices for scientific computing: breaking analyses down into modules, task automation, and encapsulation. Note that this workshop focuses on the fundamentals of the R programming language, not on statistical analysis. A variety of third-party packages are used throughout this workshop; they are not necessarily the best, nor is all of their functionality explained, but they are packages we find useful, and they have been chosen primarily for their ease of use.

Subject:: Applied Science; Computer Science; Information Science; Mathematics; Measurement and Data
Material Type:: Module
Provider:: The Carpentries
Author:: A. s; Alejandra Gonzalez-Beltran; Ana Beatriz Villaseñor Altamirano; Antonio; AntonioJBT; Belinda Weaver; Claudia Engel; Cynthia Monastirsky; Daniel Beiter; David Mawdsley; David Pérez-Suárez; Erin Becker; EuniceML; François Michonneau; Gordon McDonald; Guillermina Actis; Guillermo Movia; Hely Salgado; Ido Bar; Ivan Ogasawara; Ivonne Lujano; James J Balamuta; Jamie McDevitt-Irwin; Jeff Oliver; Jonah Duckles; Juan M. Barrios; Katrin Leinweber; Kevin Alquicira; Kevin Martínez-Folgar; Laura Angelone; Laura-Gomez; Leticia Vega; Marcela Alfaro Córdoba; Marceline Abadeer; Maria Florencia D'Andrea; Marie-Helene Burle; Marieke Frassl; Matias Andina; Murray Cadzow; Narayanan Raghupathy; Naupaka Zimmerman; Paola Prieto; Paula Andrea Martinez; Raniere Silva; Rayna M Harris; Richard Barnes; Richard McCosh; Romualdo Zayas-Lagunas; Sandra Brosda; Sasha Lavrentovich; Shirley Alquicira Hernandez; Silvana Pereyra; Tobin Magle; Veronica Jimenez; juli arancio; raynamharris; saynomoregrl
Date Added:: 08/07/2020

10 Results