Welcome to the Data Science Bootcamp for Beginners

Overview

Are you interested in pursuing a career in data science or looking to enhance your data analysis skills? Our self-contained Data Science for Beginners course is a perfect opportunity to learn the fundamental concepts and skills of data science in a practical, hands-on manner.

This course is designed for beginners who have little to no prior experience in data science. You will learn how to use Python and popular data science libraries such as Pandas, NumPy, and Scikit-learn to clean, preprocess, analyze, and visualize data. You will also learn how to build machine learning models and deploy them.

By the end of the course, you will have the skills and knowledge needed to tackle data analysis and machine learning projects on your own. You will also have completed a capstone project, which will give you hands-on experience in applying the concepts you have learned.

Join the Data Science for Beginners course today to jumpstart your career in data science or take your data analysis skills to the next level!

Registering for this course

Registration for this course is now open and will remain open until 23:59 Anywhere on Earth (AoE) on June 23, 2023. To register, please visit the application here and complete the online registration form. The process is quick and easy, and you will receive a confirmation email once your registration is complete. Please note that seats are limited and registrations will be accepted on a first-come, first-served basis.

Classes will be held on Mondays, Wednesdays, and Fridays from 18:00 to 20:00 UTC+0 via an online meeting link that will be provided to registered participants. The course will run for 5 weeks, starting on June 26, 2023, and ending on July 28, 2023.

Don't miss out on this opportunity to kickstart your data science journey! Register now and join us for an engaging and comprehensive course that will equip you with the skills you need to excel in the field of data science. For any inquiries or further information, please contact us. We look forward to welcoming you to our Data Science for Beginners course!

Course Payment

Payment for this course is required at the time of registration to secure your spot in the class. The course fee is $30 (NLE 600), which covers all course materials and resources. Payment can be made through Mobile Money (Orange Money - +232-75-460-610), PayPal, credit card, or bank transfer. Please note that the course fee is non-refundable.

Course Prerequisites

The Data Science for Beginners course is designed for individuals who have little to no prior experience in data science. However, there are a few prerequisites that participants should meet to ensure they are prepared for the course:

By meeting these prerequisites, you will be able to keep up with the course material and get the most out of it.

Course Outline

The course is self-contained and designed for people with different skill levels. It suits a beginner with no prior data science or programming experience, someone with basic data science knowledge who wants to advance their career, or someone from a different field who wants to change their career trajectory. We will cover all the basics from the ground up and give you access to the tools and skills needed for what has been called the sexiest career of the 21st century.

This is a five-week, intensive, hands-on course that balances theory with practice. We believe the two are intertwined, and that understanding the mechanics of the underlying tools will be a force multiplier that makes you stand out in this exciting field, whether you want to be a business analyst, data engineer, data scientist, or academic researcher. We strive to cater to you all.

  • Week 0

    Elementary Math for Data Science

    • Basic Linear Algebra
    • Statistics and Probability Theory
    • Calculus and Optimization
  • Week 1

    Introduction to Data Science

    • What is data science?
    • Overview of the data science workflow
    • Tools and resources for data science
    • Basics of Python programming
    • Introduction to data types, variables, and operators
    • Data structures in Python: lists, tuples, dictionaries, and sets
    • Object Oriented Programming (OOP)
    • I/O in Python
  • Week 2

    Data Manipulation and Analysis

    • Introduction to NumPy and Pandas
    • Loading and manipulating data with Pandas
    • Data cleaning and preprocessing
    • Data visualization with Matplotlib and Seaborn
    • Exploratory data analysis (EDA)
  • Week 3

    Machine Learning Fundamentals

    • Machine learning basics
    • Supervised learning
    • Unsupervised learning
    • Introduction to scikit-learn
    • Feature engineering
    • Building and evaluating machine learning models
  • Week 4

    Capstone Project

    • Participants will work on a capstone project that demonstrates their understanding of the concepts covered in the previous four weeks.
    • The project involves data cleaning, exploratory data analysis, and building a machine learning model.
    • We will provide project ideas and data sets for participants to choose from, or participants can propose their own project ideas.

This course includes lectures, hands-on exercises, and assignments to reinforce the concepts covered. The capstone project is an opportunity for you to apply what you have learned and showcase your skills. By the end of the course, you should have a basic understanding of data science concepts and tools and be able to analyze and manipulate data and build simple machine learning models.
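
As a rough illustration of the kind of data cleaning and exploratory analysis listed under Week 2, here is a minimal sketch using Pandas. The small table of records, its column names, and the `clean` helper are made up for illustration and are not course material; the cleaning rules (median fill, dropping rows without a city) are just one reasonable assumption among many.

```python
import pandas as pd

# Hypothetical toy records, invented for this example (not a course dataset).
raw = pd.DataFrame(
    {
        "age": [22, 35, None, 58, 41],
        "city": ["Freetown", "Bo", "Freetown", None, "Kenema"],
        "income": [300.0, 520.0, 410.0, 780.0, None],
    }
)


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing numeric values with the column median,
    then drop rows that have no city."""
    out = df.copy()
    for col in ["age", "income"]:
        out[col] = out[col].fillna(out[col].median())
    return out.dropna(subset=["city"]).reset_index(drop=True)


cleaned = clean(raw)

# A first exploratory step: average income per city.
summary = cleaned.groupby("city")["income"].mean()

print(cleaned)
print(summary)
```

In a real project the cleaning choices would depend on the dataset and the question being asked, so treat this only as a starting shape.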

Detailed Course Schedule

Guidelines for the capstone project

  1. The capstone project is mandatory and must be completed by each participant to receive a certificate of completion.
  2. Participants must select a dataset relevant to their area of interest, subject to approval by the instructor.
  3. Participants must document their work throughout the project and maintain a clean and well-organized repository on GitHub.
  4. Participants must submit a final report that includes their findings, methodology, and any code or scripts used in the project.
  5. Participants are encouraged to collaborate with other participants but must submit individual projects.
  6. Participants must adhere to ethical guidelines for data collection, use, and analysis, and avoid misrepresenting or misinterpreting results.
  7. Participants must follow good software engineering practices, such as modular code design, version control, and unit testing.
  8. Participants must adhere to the project timeline and deadlines and seek assistance from the instructor or teaching assistants in case of any issues.
  9. The final project must demonstrate proficiency in data cleaning, data analysis, and machine learning model building.
  10. Participants will be evaluated based on the quality of their project, their ability to communicate their findings, and their adherence to the project guidelines.

By following these guidelines, participants will be able to successfully complete their capstone project and showcase their skills in data science and machine learning.
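
To make guideline 7 more concrete, here is a minimal sketch of what a small, modular function with unit tests might look like. The function name `normalize_column` and the test values are hypothetical, chosen only for illustration; the course may expect a different project structure or testing tool.

```python
def normalize_column(values):
    """Scale a list of numbers to the range [0, 1] (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal, so avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def test_scales_to_unit_range():
    assert normalize_column([10, 20, 30]) == [0.0, 0.5, 1.0]


def test_constant_column():
    assert normalize_column([7, 7]) == [0.0, 0.0]


if __name__ == "__main__":
    # Run the tests directly when the file is executed as a script.
    test_scales_to_unit_range()
    test_constant_column()
    print("all tests passed")
```

Keeping each step in its own function makes it easier to test in isolation, and test functions named `test_*` like these can also be collected by a test runner such as pytest.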

Datasets and resources to get started

Here are some datasets and resources to get you started on your capstone project:

Target Industries:

Data Science Applications:

Examples

  1. Titanic Dataset: This dataset contains information on the passengers who were aboard the Titanic when it sank. It includes information on age, sex, class, and survival status. This dataset is often used for predicting survival outcomes based on various features. https://www.kaggle.com/c/titanic/data
  2. Iris Dataset: This dataset contains information on the petal and sepal lengths and widths of three different species of iris flowers. This dataset is often used for classification tasks and clustering analysis. https://archive.ics.uci.edu/ml/datasets/iris
  3. Wine Quality Dataset: This dataset contains information on the physicochemical properties of different wines, along with a quality rating. This dataset is often used for classification and regression tasks. https://archive.ics.uci.edu/ml/datasets/wine+quality
  4. Boston Housing Dataset: This dataset contains information on housing prices in Boston and various features that may influence those prices. This dataset is often used for regression analysis. https://www.cs.toronto.edu/~delve/data/boston/bostonDetail.html
  5. Breast Cancer Wisconsin (Diagnostic) Dataset: This dataset contains information on various features of breast cancer tumors and a binary classification of whether the tumor is malignant or benign. This dataset is often used for classification tasks. https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
  6. MNIST Handwritten Digits Dataset: This dataset contains a large number of images of handwritten digits, along with labels indicating the digit that each image represents. This dataset is often used for classification tasks, particularly image classification. http://yann.lecun.com/exdb/mnist/
  7. California Housing Dataset: This dataset contains information on housing prices in California and various features that may influence those prices. This dataset is often used for regression analysis. https://www.kaggle.com/datasets/camnugent/california-housing-prices
  8. Pima Indians Diabetes Dataset: This dataset contains information on various features of Pima Indian women and a binary classification of whether or not they have diabetes. This dataset is often used for classification tasks. https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database
  9. Adult Income Dataset: This dataset contains information on various features of individuals and a binary classification of whether their income is above or below $50,000 per year. This dataset is often used for classification tasks. https://archive.ics.uci.edu/ml/datasets/adult
  10. Bank Marketing Dataset: This dataset contains information on various features of individuals contacted during a bank's marketing campaign and whether or not they subscribed to a term deposit. This dataset is often used for classification tasks and customer segmentation. https://archive.ics.uci.edu/ml/datasets/bank+marketing

These datasets are widely used in beginner data science capstone projects and provide a great starting point for learning various data science concepts and techniques.
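
Some of the datasets listed above are also bundled with scikit-learn, which can be a convenient way to start experimenting before downloading anything. The sketch below assumes scikit-learn is installed and uses its built-in copy of the Iris data; the choice of model, the 25% test split, and the random seed are illustrative assumptions, not course requirements.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the feature matrix (150 flowers x 4 measurements) and species labels.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data to evaluate the model on unseen examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a simple classifier; max_iter is raised so the solver can converge.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2f}")
```

The same split, fit, and evaluate pattern carries over to the other tabular datasets in the list once they are loaded, for example from a downloaded CSV file.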