3. Python as a Data-Analysis Tool

Why Python?

  1. Python is an object-oriented programming language, so creating classes and objects is easy.
  2. Simple syntax.
  3. Runs on an interpreter, which means code can be executed as soon as it is written, making prototyping easier.
  4. Huge collection of standard libraries.
  5. Integrates easily with third-party tools, which can save costs for companies.
  6. Python is open source and has strong community support.

Introduction to Jupyter Notebook

  • An IDE (Integrated Development Environment): An IDE is a software suite that consolidates the basic tools that developers need to write and test software.
  • Jupyter Notebook: Jupyter Notebook was previously called IPython Notebook.
  • Versatility and Shareability: Jupyter Notebook supports many kinds of tasks, and notebooks are saved as .ipynb files that can be shared with others.
  • Data Visualization: It has the ability to display data visualizations in the same window.
  • Open-Source Web Application: Jupyter Notebook is an open-source web application.
  • Notebook: A notebook is a collection of code, documentation, and visualizations all in one place.
  • Independent Cells: Jupyter Notebook has independent cells for different parts of the code, allowing execution of specific code sections without running the entire code.

Features of Jupyter Notebook

  • Cell Type Dropdown Menu: In Jupyter Python 3 notebooks, there is a 4-option dropdown menu that sets the type of the selected cell:
    • Code: A cell where you can write and run Python code.
    • Markdown (text): A cell where you can write notes as formatted text.
    • Raw NBConvert: A cell whose content is not executed and is passed through unchanged when the notebook is converted to another format, such as HTML.
    • Heading: A cell for section headers. Start the line with one or more pound signs (for example, ## followed by a space) so that the text after it is treated as a header; the number of pound signs sets the heading level.

Data Types in Python

Python Data Types

  • int: Integer, a whole number without decimals, e.g., 5, -10, 42.
  • float: Floating-point number, a number with decimals, e.g., 5.6, -3.14, 2.0.
  • str: String, a sequence of characters, e.g., "Hello", "12345".
  • bool: Boolean, representing True or False values.
  • list: Ordered collection of items, e.g., [1, 2, 3], ["apple", "banana"].
  • tuple: Immutable ordered collection of items, e.g., (1, 2, 3), ("apple", "banana").
  • set: Unordered collection of unique items, e.g., {1, 2, 3}, {"apple", "banana"}.
  • dict: Dictionary, a collection of key-value pairs, e.g., {"name": "John", "age": 25}.
  • NoneType: Represents the absence of a value or a null value, e.g., None.

Python Data Types - Key Points

To check the data type of any variable: Use the built-in type() function. type(var) returns the type of the variable passed as an argument.
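
As a quick illustration of type(), the sketch below runs it on one sample value for each data type listed above. The sample values and the names samples and type_names are made up for this example.

```python
# Hypothetical sample values, one per built-in type from the list above.
samples = [5, 5.6, "Hello", True, [1, 2, 3], (1, 2, 3), {1, 2, 3}, {"name": "John"}, None]

for value in samples:
    # type(value).__name__ is the short class name, e.g. "int"
    print(repr(value), "->", type(value).__name__)

# Collect the names so they can be inspected in one place.
type_names = [type(value).__name__ for value in samples]
```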

Integers: In Python, an int value can be arbitrarily large; its size is limited only by available memory. Integers are represented by the int class.

Floats: Floats can be represented using e or E notation, which is called scientific notation. Floating-point numbers are represented by the float class.
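
A short sketch of both points, with illustrative variable names: an integer well beyond 64 bits stays exact, and floats can be written with e or E notation.

```python
# Integers grow as needed: 2**100 does not fit in 64 bits but is still exact.
big = 2 ** 100
print(big)        # 1267650600228229401496703205376
print(type(big))  # <class 'int'>

# Scientific notation: the number after e/E is the power of ten.
avogadro = 6.022e23  # 6.022 * 10**23
small = 1.5E-3       # upper-case E works too
print(small)         # 0.0015
print(type(small))   # <class 'float'>
```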

Sequence Data Types: In Python, sequences are ordered collections of similar or different data types. Sequences allow storing multiple values in an organized and efficient manner.

Strings: Strings in Python are immutable sequences of Unicode characters. A string is a collection of zero or more characters enclosed in single, double, or triple quotes. Strings are represented by the str class. Individual characters of a string can be accessed by indexing, starting at index 0.
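
The following sketch shows indexing, slicing, and the three quoting styles on made-up sample strings. It also shows that len() counts characters rather than bytes.

```python
word = "Python"
first = word[0]     # 'P'  (indexing starts at 0)
last = word[-1]     # 'n'  (negative indexes count from the end)
middle = word[1:4]  # 'yth' (slice from index 1 up to, not including, 4)
print(first, last, middle)

single = 'single quotes'
multi = """triple quotes can
span several lines"""

# len() counts characters; the UTF-8 encoding of an accented letter uses 2 bytes.
accented = "caf\u00e9"
print(len(accented))                  # 4
print(len(accented.encode("utf-8")))  # 5
```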

Booleans: Booleans are Python data types with two built-in values: True and False. Only these capitalized spellings are valid; lower-case true and false are not recognized as booleans.
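
A minimal sketch of boolean values, with illustrative names. The try block deliberately uses the lower-case spelling to show that Python treats it as an undefined name.

```python
is_ready = True
print(type(is_ready))  # <class 'bool'>

# Comparisons produce booleans.
is_bigger = 5 > 3
print(is_bigger)  # True

# Lower-case "true" is just an undefined variable name in Python.
try:
    flag = true  # deliberately wrong spelling
except NameError as err:
    message = str(err)
    print("NameError:", message)
```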

Basic Operations

Python Basic Operations - Key Points

Arithmetic Operations: Python supports basic arithmetic operations like addition, subtraction, multiplication, division, and modulus.

  • Addition: a + b
  • Subtraction: a - b
  • Multiplication: a * b
  • Division: a / b
  • Modulus (Remainder): a % b
  • Exponentiation (Power): a ** b
  • Floor Division: a // b
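
The operators above can be tried on two sample numbers. The values 17 and 5 are arbitrary, and the dictionary is only there to label each result.

```python
a, b = 17, 5

results = {
    "a + b": a + b,    # 22
    "a - b": a - b,    # 12
    "a * b": a * b,    # 85
    "a / b": a / b,    # 3.4 (division always gives a float)
    "a % b": a % b,    # 2   (remainder)
    "a ** b": a ** b,  # 1419857
    "a // b": a // b,  # 3   (quotient rounded down)
}

for expression, value in results.items():
    print(expression, "=", value)

# Floor division rounds toward negative infinity, not toward zero.
negative_floor = -17 // 5  # -4
```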

Comparison Operations: Used to compare two values and return a boolean result.

  • Equal to: a == b
  • Not equal to: a != b
  • Greater than: a > b
  • Less than: a < b
  • Greater than or equal to: a >= b
  • Less than or equal to: a <= b
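
A small sketch of the comparison operators on arbitrary sample values. Each expression evaluates to True or False.

```python
a, b = 7, 10

comparisons = {
    "a == b": a == b,  # False
    "a != b": a != b,  # True
    "a > b": a > b,    # False
    "a < b": a < b,    # True
    "a >= b": a >= b,  # False
    "a <= b": a <= b,  # True
}

for expression, value in comparisons.items():
    print(expression, "->", value)

# Comparisons can be chained, and strings compare alphabetically.
in_range = 1 < a < b                 # True
alphabetical = "apple" < "banana"    # True
```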

Logical Operations: Used to perform logical operations and return boolean results.

  • AND: a and b
  • OR: a or b
  • NOT: not a
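
Python writes its logical operators as the keywords and, or, and not. The sketch below combines a few conditions; the variable names and values are invented for the example.

```python
age = 25
has_ticket = False

is_adult = age >= 18                 # True
can_enter = is_adult and has_ticket  # False: both sides must be True
either = is_adult or has_ticket      # True: one True side is enough
needs_ticket = not has_ticket        # True: not flips the value

print(can_enter, either, needs_ticket)

# "or" returns the first truthy operand, which is handy for fallbacks.
display_name = "" or "Guest"  # "Guest", because the empty string is falsy
```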

Assignment Operations: Used to assign values to variables.

  • Simple assignment: a = b
  • Add and assign: a += b
  • Subtract and assign: a -= b
  • Multiply and assign: a *= b
  • Divide and assign: a /= b
  • Modulo and assign: a %= b
  • Exponentiate and assign: a **= b
  • Floor divide and assign: a //= b
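
The compound assignment operators can be traced step by step. In this sketch the starting value and the history list are illustrative; the list records the value after each step.

```python
history = []

total = 10
total += 5    # 15
history.append(total)
total -= 3    # 12
history.append(total)
total *= 2    # 24
history.append(total)
total /= 4    # 6.0 (true division turns the value into a float)
history.append(total)
total %= 4    # 2.0
history.append(total)
total **= 3   # 8.0
history.append(total)
total //= 3   # 2.0
history.append(total)

print(history)  # [15, 12, 24, 6.0, 2.0, 8.0, 2.0]
```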

Membership Operations: Used to check if a value is present in a sequence (e.g., a list, tuple, or string).

  • In: a in b
  • Not in: a not in b
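
A brief sketch of membership tests on a list, a string, and a dictionary, using made-up sample data. For a dictionary, in checks the keys, not the values.

```python
fruits = ["apple", "banana", "cherry"]
has_banana = "banana" in fruits    # True
no_mango = "mango" not in fruits   # True

# On strings, "in" checks for a substring.
has_substring = "an" in "banana"   # True

# On dictionaries, "in" looks at keys only.
person = {"name": "John", "age": 25}
has_name_key = "name" in person    # True
has_john_key = "John" in person    # False ("John" is a value, not a key)

print(has_banana, no_mango, has_substring, has_name_key, has_john_key)
```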

Identity Operations: Used to compare the memory locations of two objects.

  • Is: a is b
  • Is not: a is not b
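
The difference between equality (==) and identity (is) is easiest to see with two lists that hold the same values. The names below are illustrative.

```python
first = [1, 2, 3]
second = [1, 2, 3]  # a separate list with equal contents
alias = first       # a second name for the same list object

same_values = first == second  # True: contents are equal
same_object = first is second  # False: two distinct objects
same_alias = first is alias    # True: both names point to one object

print(same_values, same_object, same_alias)

# A common use of "is" is checking for None.
value = None
is_missing = value is None  # True
```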

Condition Statements & Loops

Branching in Python (if-else)

Basic Syntax:

if condition:
    # Execute if condition is True
elif another_condition:
    # Execute if another_condition is True
else:
    # Execute if no condition is True

Example:

def check_number(num):
    if num > 0:
        print("Positive number")
    elif num < 0:
        print("Negative number")
    else:
        print("Zero")

check_number(10)  # Positive number

Key Points:

  • if - checks the condition; executes if True.
  • elif - checks an alternative condition if the first is False.
  • else - executes if all the previous conditions are False.
  • Conditions are evaluated in order. Once a condition is found to be True, no other conditions are checked.
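
The last point, that checking stops at the first True condition, can be seen in a sketch like this one. The function name grade and the score thresholds are hypothetical.

```python
def grade(score):
    # Conditions are tested from top to bottom; the first match wins.
    if score >= 90:
        return "A"
    elif score >= 80:
        return "B"
    elif score >= 70:
        return "C"
    else:
        return "F"

# 95 also satisfies score >= 80 and score >= 70,
# but those branches are never reached.
print(grade(95))  # A
print(grade(72))  # C
print(grade(10))  # F
```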

For Loop in Python

Basic Syntax:

for item in iterable:
    # Execute for each item in iterable

Example:

fruits = ["apple", "banana", "cherry"]

for fruit in fruits:
    print(fruit)  # apple, banana, cherry

Key Points:

  • Used to iterate over sequences (lists, tuples, strings, etc.).
  • Can loop over a range using range(start, stop, step).
  • Loops through each element one by one.
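
As a sketch of the range(start, stop, step) form and of looping over a string, with illustrative variable names:

```python
# range(0, 10, 2) starts at 0, stops before 10, and moves in steps of 2.
evens = []
for number in range(0, 10, 2):
    evens.append(number)
print(evens)  # [0, 2, 4, 6, 8]

# A string is a sequence too, so a for loop visits it character by character.
letters = []
for character in "abc":
    letters.append(character.upper())
print(letters)  # ['A', 'B', 'C']
```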

While Loop in Python

Basic Syntax:

while condition:
    # Execute as long as condition is True

Example:

count = 1
while count <= 5:
    print(count)
    count += 1  # Increment count to avoid infinite loop

Key Points:

  • The loop continues as long as the condition is True.
  • Ensure that the condition will eventually become False to avoid infinite loops.
  • Useful when the number of iterations is not known beforehand.
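
A small sketch of a loop whose number of iterations is not obvious in advance: it keeps halving a value until it is no longer greater than 1. The starting value is arbitrary.

```python
value = 100
steps = 0

# The condition is re-checked before every pass; halving guarantees
# that value eventually stops being greater than 1.
while value > 1:
    value //= 2  # 50, 25, 12, 6, 3, 1
    steps += 1

print(steps)  # 6
print(value)  # 1
```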

Functions in Python

Basic Syntax:

def function_name(parameters):
    # Function body
    # Code to execute
    return result  # Optional, to return a value

Example:

def greet(name):
    return "Hello, " + name + "!"

message = greet("Alice")
print(message)  # Output: Hello, Alice!

Key Points:

  • def is used to define a function in Python.
  • The function can take parameters (input) and return a value (output).
  • Functions can be called multiple times with different arguments.
  • If no return statement is used, the function returns None by default.
  • Functions allow you to organize code, reuse logic, and make the code more modular and readable.
  • You can define default values for function parameters (e.g., def greet(name="Guest")).
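
Two of the points above, default parameter values and the implicit None return, can be sketched as follows. The function log_message is a hypothetical helper added for this example.

```python
def greet(name="Guest"):
    # name falls back to "Guest" when no argument is passed
    return "Hello, " + name + "!"

def log_message(text):
    # No return statement, so calling this function yields None.
    print("LOG:", text)

default_greeting = greet()       # "Hello, Guest!"
named_greeting = greet("Alice")  # "Hello, Alice!"
result = log_message("saved")    # prints the text, returns None

print(default_greeting)
print(named_greeting)
print(result)  # None
```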

Basic Libraries

Python Libraries

Definition:

A library is a collection of functions and methods which allows the user to perform actions without writing lengthy code to achieve a task.
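
To illustrate the definition, here is a minimal sketch that uses statistics, a module from Python's standard library, so nothing has to be installed. It is chosen only for convenience; the scores are made-up sample data.

```python
import statistics  # part of the standard library

scores = [70, 85, 90, 95]

# One call each, instead of writing the sum/sort logic by hand.
mean_score = statistics.mean(scores)      # (70 + 85 + 90 + 95) / 4
median_score = statistics.median(scores)  # midpoint of 85 and 90

print(mean_score)
print(median_score)  # 87.5
```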

Common Libraries used in Data Analysis:

  • Pandas: Helps in data manipulation and analysis.
  • NumPy: Helps in performing complicated mathematical calculations for large datasets.
  • Matplotlib: Helps in plotting different types of graphs in Python.

Installing Libraries:

To install a library, run the command that matches your environment:

(for Anaconda Environment): conda install <library name>
(for Python Environment): pip install <library name>