This course introduces Unix to students as the most convenient tool for working with big data in the biological sciences, such as next-generation sequencing (NGS) data. NGS technologies produce a massive amount of data in each run, which is difficult to handle with GUI-based tools; even opening the raw files can be difficult. That is why sequencing data are produced and stored in text format, for easy handling and processing. Unix skills are an asset for bioinformatics. Unix is easy to use, convenient, and saves a lot of time.

People skilled in bioinformatics know very well how to analyze data with programming languages such as Perl or Python. But not all of them realize that it is not necessary to write a program every time. With the help of Unix utilities, data handling and processing, input formatting for software, and simple text processing of results can be performed without high-end programming skills or special software. You will still need software and programming skills for advanced bioinformatics analyses. Unix is a great skill for bioscience researchers, scientists, and NGS beginners.

Unix skills will help you build pipelines in which you can combine different software to meet your own objectives, such as:

- Counting and formatting of FASTA and FASTQ sequences
- Converting multi-line FASTA sequences to single-line FASTA sequences
- Extraction of desired FASTA and FASTQ sequences from a whole dataset
- Splitting and subsetting of large sequence files
- Formatting of BLAST, Pfam, and InterPro output for analysis
- Extraction of subsequences from genome files
- Sequence file cleaning: trimming and filtering of sequences
- Random dataset generation
- Bulk data processing for common tasks

and many more common tasks.

Here, I intend to cover only the specific aspects of Unix required for NGS data processing and project management. The whole course is divided into 4 modules, from basic commands to scripts. In this course, you will have a lot of practice opportunities.
In 4 days, you will learn through tutorials, video lectures, and practice assignments. There are several possible ways to teach and learn this material, but I have used the easiest and simplest approach, and focused on developing a way of thinking about data processing rather than on advanced and compact use of commands. In the guide to practice commands, I have given multiple approaches to performing a single task, so you will also have the opportunity to use compact and advanced command options.

Day 1 - Introduction to NGS and UNIX
- Course introduction
- Brief description of NGS and UNIX (video)
- Unix: how to start, basic commands (directories and files: creation, removal, navigation, listing, writing/retrieval, and unpacking of NGS data files)
- System information related commands and their usage
- Quick revision
- Practice assignments
- Challenge of the day

Day 2 - NGS bioinformatics data excursion
- NGS: data sources, files, and file formats. Unix commands for excursion
- Smart tricks to solve complex problems
- Quick revision
- Practice assignments (with common NGS data processing related tasks)
- Challenge of the day

Day 3 - Flying with commands
- File streaming and redirection, stream editor, pipes, filters
- Permissions, symbolic linking, construction of pipelines on the terminal
- Practice assignments (with common NGS data processing related tasks)
- Challenge of the day

Day 4 - Bulk data processing
- Brief introduction to shell scripting
- Pattern matching, variables, subshells, and loops
- Practice assignments (with common NGS data processing related tasks)
- Challenge of the day
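
As a taste of the kind of task practiced in the assignments, here is a minimal sketch of two of the objectives mentioned in the course description: counting FASTA sequences and converting a multi-line FASTA file to single-line FASTA. It is not taken from the course material. The file names, the temporary directory, and the three demo sequences are all hypothetical, and this is only one of several possible ways to do it with standard utilities (`grep` and `awk`).

```shell
# Work in a throwaway directory (hypothetical location) so nothing is overwritten.
tmpdir=$(mktemp -d)

# Create a tiny multi-line FASTA file for demonstration (made-up sequences).
cat > "$tmpdir/demo.fasta" <<'EOF'
>seq1
ACGT
ACGT
>seq2
GGCC
>seq3
TTAA
TT
EOF

# Count sequences: every FASTA record starts with a header line beginning with '>'.
seq_count=$(grep -c '^>' "$tmpdir/demo.fasta")
echo "Number of sequences: $seq_count"

# Convert multi-line FASTA to single-line FASTA:
#   - on a header line, first print the sequence collected so far, then the header
#   - on any other line, append it to the sequence being collected
#   - at the end of the file, print the last collected sequence
awk '/^>/ { if (seq != "") print seq; print; seq = ""; next }
     { seq = seq $0 }
     END { if (seq != "") print seq }' \
    "$tmpdir/demo.fasta" > "$tmpdir/demo.single.fasta"

cat "$tmpdir/demo.single.fasta"
```

With each sequence on a single line, later steps such as extracting or filtering records become simple line-based operations, which is why this conversion is often done first.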

Unix essentials for NGS bioinformatics
