Count number of letters and numbers in a string in python
Dec. 13, 2021, 10:10 a.m.
x="hello world! 12345"
from collections import Counter
mycounter = Counter(x)
# set() removes duplicate characters
# sorted() arranges the unique characters in order
for y in sorted(set(x)):
    print(y+'='+str(mycounter[y]))
output:
 =2
!=1
1=1
2=1
3=1
4=1
5=1
d=1
e=1
h=1
l=3
o=2
r=1
w=1
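The counter above tallies every distinct character, including the space and the exclamation mark. If the goal is the total number of letters and the total number of digits, a minimal sketch could look like the following. The function name count_letters_and_digits is made up for illustration.

```python
def count_letters_and_digits(text):
    """Return (number of letters, number of digits) found in text."""
    letters = sum(1 for ch in text if ch.isalpha())
    digits = sum(1 for ch in text if ch.isdigit())
    return letters, digits

x = "hello world! 12345"
letters, digits = count_letters_and_digits(x)
print("letters=" + str(letters))  # letters=10
print("digits=" + str(digits))    # digits=5
```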
Switch case in python
Dec. 10, 2021, 9:37 a.m.
#switch case in python
def add():
    print('added')
def mul():
    print('multiple ')
def nodata():
    print('funct not found ')
func_dict={
'cond_a':add,
'cond_b':mul
}
cond='cond_a'
func_dict.get(cond,nodata)()
cond='cond_c'
func_dict.get(cond,nodata)()
print('------------method 2---------------')
#method two
def calc(operator,x,y):
    return {
        'add':lambda:x+y,
        'sub':lambda:x-y
    }.get(operator,lambda:"operator not found" )()
print(calc('add',1,2))
print(calc('mul',1,2))
Output:
added
funct not found
------------method 2---------------
3
operator not found
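Method 2 rebuilds its dictionary of lambdas on every call. As a possible variation, the lookup table can be built once at module level from the standard operator module. The names OPERATIONS and calc_with_table below are illustrative, not part of the original snippet.

```python
import operator

# Map each operator name to a ready-made function from the operator module.
OPERATIONS = {
    'add': operator.add,
    'sub': operator.sub,
}

def calc_with_table(name, x, y):
    """Look up the operation by name; fall back to a message if it is unknown."""
    func = OPERATIONS.get(name)
    if func is None:
        return "operator not found"
    return func(x, y)

print(calc_with_table('add', 1, 2))  # 3
print(calc_with_table('mul', 1, 2))  # operator not found
```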
Convert given number of days in terms of weeks and days
Nov. 2, 2021, 2:10 a.m.
#32=4weeks + 4days
#7=1week
#8=1week +1day
#6=6days
def find(number_of_days):
    weeks = number_of_days // 7  # whole weeks
    days = number_of_days % 7    # leftover days
    print(str(weeks) + ' weeks + ' + str(days) + ' days')
find(32)
output:
4 weeks + 4 days
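The quotient and the remainder can also be computed in one step with the built-in divmod(). This is a small sketch; the function name weeks_and_days is an assumption for illustration, and it returns the values instead of printing them.

```python
def weeks_and_days(number_of_days):
    """Split a number of days into (whole weeks, remaining days)."""
    weeks, days = divmod(number_of_days, 7)
    return weeks, days

print(weeks_and_days(32))  # (4, 4)
print(weeks_and_days(6))   # (0, 6)
```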
How to remove more than one space in string python
July 14, 2021, 10:37 p.m.
import re
re.sub(' +', ' ', 'The     quick brown    fox')
output:
'The quick brown fox'
import re
def spacing_issue(str_title):
    temp = str_title
    try:
        str_title = str_title.strip()
        str_title = re.sub(r' +,', ',', str_title)  # spaces before comma
        str_title = re.sub(r' +/', '/', str_title)  # spaces before forward slash /
        str_title = re.sub(r'/ +', '/', str_title)  # spaces after forward slash /
        str_title = re.sub(r' +:', ':', str_title)  # spaces before colon :
        str_title = re.sub(r':(?=\S)', ': ', str_title)  # add a space after a colon that is followed by a non-space
        str_title = re.sub(r' +', ' ', str_title)  # repeated spaces
        str_title = re.sub(r'\( +', '(', str_title)  # spaces after open parenthesis (
        str_title = re.sub(r' +\)', ')', str_title)  # spaces before closing parenthesis )
        return str_title
    except Exception as e:
        print("error", e)
        return temp
print(spacing_issue('hello ,    world'))
output:
hello, world
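When the only goal is to collapse runs of whitespace (tabs and newlines included) into single spaces, a regular expression is not strictly needed. A minimal sketch using str.split() and str.join() follows; the function name collapse_whitespace is made up for illustration.

```python
def collapse_whitespace(text):
    """Split on any run of whitespace and rejoin the words with single spaces."""
    return ' '.join(text.split())

print(collapse_whitespace('  The   quick\tbrown \n fox  '))  # The quick brown fox
```

Note that this also strips leading and trailing whitespace, which the ' +' pattern alone does not do.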
Random password generator in python
June 27, 2021, 9:56 a.m.
import string as s
from random import choice, randint
ch=s.ascii_letters+s.digits+s.punctuation
# print(ch)
password="".join(choice(ch) for x in range(randint(8,16)))
print(password)
output (differs on every run):
(a random string of 8 to 16 letters, digits and punctuation characters)
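The random module is documented as unsuitable for security purposes, and the standard library provides the secrets module for that case. A hedged sketch of the same generator built on secrets is shown below. The names ALPHABET and generate_password and the default length range are assumptions chosen to mirror the snippet above.

```python
import secrets
import string

# Same character pool as above: letters, digits and punctuation.
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def generate_password(min_length=8, max_length=16):
    """Build a password of random length using a cryptographically strong source."""
    length = min_length + secrets.randbelow(max_length - min_length + 1)
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

print(generate_password())
```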
How to create and manipulate sql databases with python
April 18, 2021, 8:27 a.m.
install:
pip install mysql-connector-python
pip install pandas
Importing Libraries
As with every project in Python, the very first thing we want to do is import our libraries.
It is best practice to import all the libraries we are going to use at the beginning of the project, so that people reading or reviewing our code know roughly what is coming up and there are no surprises.
import mysql.connector
from mysql.connector import Error
import pandas as pd
Connecting to MySQL Server
def create_server_connection(host_name, user_name, user_password):
    connection = None
    try:
        connection = mysql.connector.connect(
            host=host_name,
            user=user_name,
            passwd=user_password
        )
        print("MySQL Database connection successful")
    except Error as err:
        print(f"Error: '{err}'")
    return connection
Creating a reusable function for code like this is best practice, so that we can use it again and again with minimum effort. Once it is written, you can reuse it in all of your future projects too, so future-you will be grateful!
Let's go through this line by line so we understand what's happening here:
The first line is us naming the function (create_server_connection) and naming the arguments that that function will take (host_name, user_name and user_password).
The next line initialises connection to None, so the function has a defined value to return even if the connection attempt fails.
Next we use a Python try-except block to handle any potential errors. The first part tries to create a connection to the server using the mysql.connector.connect() method using the details specified by the user in the arguments. If this works, the function prints a happy little success message.
The except part of the block prints the error which MySQL Server returns, in the unfortunate circumstance that there is an error.
Finally, the function returns the connection object if the connection succeeded, or None if it did not.
connection = create_server_connection("localhost", "root", 'mypassword')
Creating a New Database
Now that we have established a connection, our next step is to create a new database on our server.
In this tutorial we will do this only once, but again we will write this as a re-usable function so we have a nice useful function we can re-use for future projects.
def create_database(connection, query):
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        print("Database created successfully")
    except Error as err:
        print(f"Error: '{err}'")
This function takes two arguments, connection (our connection object) and query (a SQL query which we will write in the next step). It executes the query in the server via the connection.
We use the cursor method on our connection object to create a cursor object (MySQL Connector uses an object-oriented programming paradigm, so there are lots of objects inheriting properties from parent objects).
create_database_query="create database school_db"
create_database(connection,create_database_query)
Connecting to the Database
Now that we have created a database in MySQL Server, we can modify our create_server_connection function to connect directly to this database.
Note that it's possible - common, in fact - to have multiple databases on one MySQL Server, so we want to always and automatically connect to the database we're interested in.
def create_db_connection(host_name, user_name, user_password, db_name):
    connection = None
    try:
        connection = mysql.connector.connect(
            host=host_name,
            user=user_name,
            passwd=user_password,
            database=db_name
        )
        print("MySQL Database connection successful")
    except Error as err:
        print(f"Error: '{err}'")
    return connection
This is the exact same function, but now we take one more argument - the database name - and pass that as an argument to the connect() method.
Creating a Query Execution Function
The final function we're going to create (for now) is an extremely vital one - a query execution function. This is going to take our SQL queries, stored in Python as strings, and pass them to the cursor.execute() method to execute them on the server.
def execute_query(connection, query):
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        connection.commit()
        print("Query successful")
    except Error as err:
        print(f"Error: '{err}'")
This function is exactly the same as our create_database function from earlier, except that it uses the connection.commit() method to make sure that the commands detailed in our SQL queries are implemented.
This is going to be our workhorse function, which we will use (alongside create_db_connection) to create tables, establish relationships between those tables, populate the tables with data, and update and delete records in our database.
Creating Tables
Now we're all set to start running SQL commands into our Server and to start building our database. The first thing we want to do is to create the necessary tables.
create_teacher_table = """
CREATE TABLE teacher (
teacher_id INT PRIMARY KEY,
first_name VARCHAR(40) NOT NULL,
last_name VARCHAR(40) NOT NULL,
language_1 VARCHAR(3) NOT NULL,
language_2 VARCHAR(3),
dob DATE,
tax_id INT UNIQUE,
phone_no VARCHAR(20)
);
"""
connection = create_db_connection("localhost", "root", 'mypassword', 'school_db') # Connect to the Database
execute_query(connection, create_teacher_table) # Execute our defined query
Now let's create the remaining tables.
create_client_table = """
CREATE TABLE client (
client_id INT PRIMARY KEY,
client_name VARCHAR(40) NOT NULL,
address VARCHAR(60) NOT NULL,
industry VARCHAR(20)
);
"""
create_participant_table = """
CREATE TABLE participant (
participant_id INT PRIMARY KEY,
first_name VARCHAR(40) NOT NULL,
last_name VARCHAR(40) NOT NULL,
phone_no VARCHAR(20),
client INT
);
"""
create_course_table = """
CREATE TABLE course (
course_id INT PRIMARY KEY,
course_name VARCHAR(40) NOT NULL,
language VARCHAR(3) NOT NULL,
level VARCHAR(2),
course_length_weeks INT,
start_date DATE,
in_school BOOLEAN,
teacher INT,
client INT
);
"""
connection = create_db_connection("localhost", "root", 'mypassword', 'school_db')
execute_query(connection, create_client_table)
execute_query(connection, create_participant_table)
execute_query(connection, create_course_table)
This creates the four tables necessary for our four entities.
Now we want to define the relationships between them and create one more table to handle the many-to-many relationship between the participant and course tables.
We do this in exactly the same way:
alter_participant = """
ALTER TABLE participant
ADD FOREIGN KEY(client)
REFERENCES client(client_id)
ON DELETE SET NULL;
"""
alter_course = """
ALTER TABLE course
ADD FOREIGN KEY(teacher)
REFERENCES teacher(teacher_id)
ON DELETE SET NULL;
"""
alter_course_again = """
ALTER TABLE course
ADD FOREIGN KEY(client)
REFERENCES client(client_id)
ON DELETE SET NULL;
"""
create_takescourse_table = """
CREATE TABLE takes_course (
participant_id INT,
course_id INT,
PRIMARY KEY(participant_id, course_id),
FOREIGN KEY(participant_id) REFERENCES participant(participant_id) ON DELETE CASCADE,
FOREIGN KEY(course_id) REFERENCES course(course_id) ON DELETE CASCADE
);
"""
connection = create_db_connection("localhost", "root", 'mypassword', 'school_db')
execute_query(connection, alter_participant)
execute_query(connection, alter_course)
execute_query(connection, alter_course_again)
execute_query(connection, create_takescourse_table)
Now our tables are created, along with the appropriate constraints, primary key, and foreign key relations.
Inserting Into Tables
The next step is to add some records to the tables. Again we use execute_query to feed our existing SQL commands into the Server. Let's again start with the Teacher table.
insert_teacher = """
INSERT INTO teacher VALUES
(1, 'James', 'Smith', 'ENG', NULL, '1985-04-20', 12345, '+491774553676'),
(2, 'Stefanie', 'Martin', 'FRA', NULL, '1970-02-17', 23456, '+491234567890'),
(3, 'Steve', 'Wang', 'MAN', 'ENG', '1990-11-12', 34567, '+447840921333'),
(4, 'Friederike', 'Müller-Rossi', 'DEU', 'ITA', '1987-07-07', 45678, '+492345678901'),
(5, 'Isobel', 'Ivanova', 'RUS', 'ENG', '1963-05-30', 56789, '+491772635467'),
(6, 'Niamh', 'Murphy', 'ENG', 'IRI', '1995-09-08', 67890, '+491231231232');
"""
connection = create_db_connection("localhost", "root", 'mypassword', 'school_db')
execute_query(connection, insert_teacher)
Let's insert the remaining data into the tables:
insert_client = """
INSERT INTO client VALUES
(101, 'Big Business Federation', '123 Falschungstraße, 10999 Berlin', 'NGO'),
(102, 'eCommerce GmbH', '27 Ersatz Allee, 10317 Berlin', 'Retail'),
(103, 'AutoMaker AG', '20 Künstlichstraße, 10023 Berlin', 'Auto'),
(104, 'Banko Bank', '12 Betrugstraße, 12345 Berlin', 'Banking'),
(105, 'WeMoveIt GmbH', '138 Arglistweg, 10065 Berlin', 'Logistics');
"""
insert_participant = """
INSERT INTO participant VALUES
(101, 'Marina', 'Berg','491635558182', 101),
(102, 'Andrea', 'Duerr', '49159555740', 101),
(103, 'Philipp', 'Probst', '49155555692', 102),
(104, 'René', 'Brandt', '4916355546', 102),
(105, 'Susanne', 'Shuster', '49155555779', 102),
(106, 'Christian', 'Schreiner', '49162555375', 101),
(107, 'Harry', 'Kim', '49177555633', 101),
(108, 'Jan', 'Nowak', '49151555824', 101),
(109, 'Pablo', 'Garcia', '49162555176', 101),
(110, 'Melanie', 'Dreschler', '49151555527', 103),
(111, 'Dieter', 'Durr', '49178555311', 103),
(112, 'Max', 'Mustermann', '49152555195', 104),
(113, 'Maxine', 'Mustermann', '49177555355', 104),
(114, 'Heiko', 'Fleischer', '49155555581', 105);
"""
insert_course = """
INSERT INTO course VALUES
(12, 'English for Logistics', 'ENG', 'A1', 10, '2020-02-01', TRUE, 1, 105),
(13, 'Beginner English', 'ENG', 'A2', 40, '2019-11-12', FALSE, 6, 101),
(14, 'Intermediate English', 'ENG', 'B2', 40, '2019-11-12', FALSE, 6, 101),
(15, 'Advanced English', 'ENG', 'C1', 40, '2019-11-12', FALSE, 6, 101),
(16, 'Mandarin für Autoindustrie', 'MAN', 'B1', 15, '2020-01-15', TRUE, 3, 103),
(17, 'Français intermédiaire', 'FRA', 'B1', 18, '2020-04-03', FALSE, 2, 101),
(18, 'Deutsch für Anfänger', 'DEU', 'A2', 8, '2020-02-14', TRUE, 4, 102),
(19, 'Intermediate English', 'ENG', 'B2', 10, '2020-03-29', FALSE, 1, 104),
(20, 'Fortgeschrittenes Russisch', 'RUS', 'C1', 4, '2020-04-08', FALSE, 5, 103);
"""
insert_takescourse = """
INSERT INTO takes_course VALUES
(101, 15),
(101, 17),
(102, 17),
(103, 18),
(104, 18),
(105, 18),
(106, 13),
(107, 13),
(108, 13),
(109, 14),
(109, 15),
(110, 16),
(110, 20),
(111, 16),
(114, 12),
(112, 19),
(113, 19);
"""
connection = create_db_connection("localhost", "root", 'mypassword', 'school_db')
execute_query(connection, insert_client)
execute_query(connection, insert_participant)
execute_query(connection, insert_course)
execute_query(connection, insert_takescourse)
Reading Data
Now we have a functional database to work with. As a Data Analyst, you are likely to come into contact with existing databases in the organisations where you work. It will be very useful to know how to pull data out of those databases so it can then be fed into your python data pipeline. This is what we are going to work on next.
For this, we will need one more function, this time using cursor.fetchall() instead of connection.commit(). With this function, we are reading data from the database and will not be making any changes.
def read_query(connection, query):
    cursor = connection.cursor()
    result = None
    try:
        cursor.execute(query)
        result = cursor.fetchall()
        return result
    except Error as err:
        print(f"Error: '{err}'")
Again, we are going to implement this in a very similar way to execute_query. Let's try it out with a simple query to see how it works.
q1 = """
SELECT *
FROM teacher;
"""
connection = create_db_connection("localhost", "root", 'mypassword', 'school_db')
results = read_query(connection, q1)
for result in results:
    print(result)
Let's select data using a join:
q2 = """
SELECT course.course_id, course.course_name, course.language, client.client_name, client.address
FROM course
JOIN client
ON course.client = client.client_id
WHERE course.in_school = FALSE;
"""
connection = create_db_connection("localhost", "root", 'mypassword', 'school_db')
results = read_query(connection, q2)
for result in results:
    print(result)
Formatting Output into a List
# Initialise empty list
from_db = []
# Loop over the results and append them into our list
# Returns a list of tuples
for result in results:
    from_db.append(result)
print(from_db)
Formatting Output into a List of Lists
# Returns a list of lists
from_db = []
for result in results:
    result = list(result)
    from_db.append(result)
print(from_db)
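The same conversion can be written as a list comprehension. The sketch below uses a small hand-written sample_results list in place of a live query result, so the rows are illustrative stand-ins for what cursor.fetchall() returns.

```python
# Sample rows shaped like the tuples returned by cursor.fetchall() (illustrative data only)
sample_results = [
    (13, 'Beginner English', 'ENG'),
    (17, 'Français intermédiaire', 'FRA'),
]

# Convert every tuple to a list in a single expression
from_db = [list(row) for row in sample_results]
print(from_db)
```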
Formatting Output into a pandas DataFrame
For Data Analysts using Python, pandas is our beautiful and trusted old friend. It's very simple to convert the output from our database into a DataFrame, and from there the possibilities are endless!
# Returns a list of lists and then creates a pandas DataFrame
from_db = []
for result in results:
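    result = list(result)
    from_db.append(result)
columns = ["course_id", "course_name", "language", "client_name", "address"]
df = pd.DataFrame(from_db, columns=columns)
# In a Jupyter notebook, display(df) renders the DataFrame as a table;
# in a plain Python script, use print instead.
print(df)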
Updating Records
When we are maintaining a database, we will sometimes need to make changes to existing records. In this section we are going to look at how to do that.
Let's say the school is notified that one of its existing clients, the Big Business Federation, is moving offices to 23 Fingiertweg, 14534 Berlin. In this case, the database administrator (that's us!) will need to make some changes.
Thankfully, we can do this with our execute_query function alongside the SQL UPDATE statement.
update = """
UPDATE client
SET address = '23 Fingiertweg, 14534 Berlin'
WHERE client_id = 101;
"""
connection = create_db_connection("localhost", "root", 'mypassword', 'school_db')
execute_query(connection, update)
Deleting Records
It is also possible to use our execute_query function to delete records, by using DELETE.
When using SQL with relational databases, we need to be careful using the DELETE operator. This isn't Windows, there is no 'Are you sure you want to delete this?' warning pop-up, and there is no recycling bin. Once we delete something, it's really gone.
delete_course = """
DELETE FROM course
WHERE course_id = 20;
"""
connection = create_db_connection("localhost", "root", 'mypassword', 'school_db')
execute_query(connection, delete_course)
Creating Records from Lists
We saw when populating our tables that we can use the SQL INSERT command in our execute_query function to insert records into our database.
Given that we're using Python to manipulate our SQL database, it would be useful to be able to take a Python data structure (such as a list) and insert that directly into our database.
This could be useful when we want to store logs of user activity on a social media app we have written in Python, or input from users into a Wiki we have built, for example. There are as many possible uses for this as you can think of.
This method is also more secure if our database is open to our users at any point, as passing the values separately from the SQL string helps to protect against SQL injection attacks, which can damage or even destroy our whole database.
To do this, we will write a function using the executemany() method, instead of the simpler execute() method we have been using thus far.
def execute_list_query(connection, sql, val):
    cursor = connection.cursor()
    try:
        cursor.executemany(sql, val)
        connection.commit()
        print("Query successful")
    except Error as err:
        print(f"Error: '{err}'")
Now we have the function, we need to define an SQL command ('sql') and a list containing the values we wish to enter into the database ('val'). The values must be stored as a list of tuples, which is a fairly common way to store data in Python.
To add two new teachers to the database, we can write some code like this:
sql = '''
INSERT INTO teacher (teacher_id, first_name, last_name, language_1, language_2, dob, tax_id, phone_no)
VALUES (%s, %s, %s, %s, %s, %s, %s, %s)
'''
val = [
(7, 'Hank', 'Dodson', 'ENG', None, '1991-12-23', 11111, '+491772345678'),
(8, 'Sue', 'Perkins', 'MAN', 'ENG', '1976-02-02', 22222, '+491443456432')
]
connection = create_db_connection("localhost", "root", 'mypassword', 'school_db')
execute_list_query(connection, sql, val)
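To see why placeholders help against SQL injection without needing a MySQL server, the sketch below swaps in Python's built-in sqlite3 module and an in-memory database. This is a stand-in, not the MySQL setup used above: sqlite3 uses ? as its placeholder where MySQL Connector uses %s, and the table is cut down to three columns for brevity.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for the MySQL server
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()
cursor.execute(
    "CREATE TABLE teacher (teacher_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)

# sqlite3 uses ? placeholders; MySQL Connector uses %s
sql = "INSERT INTO teacher (teacher_id, first_name, last_name) VALUES (?, ?, ?)"
val = [
    (7, 'Hank', 'Dodson'),
    # A hostile-looking value is stored as plain text, not executed as SQL
    (8, "Sue'); DROP TABLE teacher; --", 'Perkins'),
]
cursor.executemany(sql, val)
connection.commit()

rows = cursor.execute(
    "SELECT teacher_id, first_name FROM teacher ORDER BY teacher_id"
).fetchall()
print(rows)
connection.close()
```

Because the values travel separately from the SQL text, the second first_name is inserted verbatim and the teacher table still exists afterwards.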
How to generate s3 presigned urls using python
Feb. 19, 2021, 4:25 a.m.
import os
import logging
import boto3
from botocore.client import Config
from botocore.exceptions import ClientError
# Requires Python 3
# pip install boto3
# s3v4
# (Default) Signature Version 4
# v4 presigned URLs contain the X-Amz-Algorithm query parameter
#
# s3
# (Deprecated) Signature Version 2; this only works in some regions, and newer regions do not support it
# if you have to generate a signed URL with an expiry of more than 7 days, use version 2 if your region supports it
s3_signature ={
'v4':'s3v4',
'v2':'s3'
}
# Below is optional you do not need these additional variables as boto3 supports
# reading values from env variables. This is for illustration purpose
AWS_ACCESS_KEY_ID = os.getenv('AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = os.getenv('AWS_SECRET_ACCESS_KEY')
AWS_DEFAULT_REGION = os.getenv('AWS_DEFAULT_REGION')
def create_presigned_url(bucket_name, bucket_key, expiration=3600, signature_version=s3_signature['v4']):
    """Generate a presigned URL for the S3 object
    :param bucket_name: string
    :param bucket_key: string
    :param expiration: Time in seconds for the presigned URL to remain valid
    :param signature_version: string
    :return: Presigned URL as string. If error, returns None.
    """
    s3_client = boto3.client('s3',
                             aws_access_key_id=AWS_ACCESS_KEY_ID,
                             aws_secret_access_key=AWS_SECRET_ACCESS_KEY,
                             config=Config(signature_version=signature_version),
                             region_name=AWS_DEFAULT_REGION
                             )
    try:
        response = s3_client.generate_presigned_url('get_object',
                                                    Params={'Bucket': bucket_name,
                                                            'Key': bucket_key},
                                                    ExpiresIn=expiration)
        print(s3_client.list_buckets()['Owner'])
        for key in s3_client.list_objects(Bucket=bucket_name, Prefix=bucket_key)['Contents']:
            print(key['Key'])
    except ClientError as e:
        logging.error(e)
        return None
    # The response contains the presigned URL
    return response
seven_days_as_seconds = 604800  # 7 days, the maximum expiry for Signature Version 4
generated_signed_url = create_presigned_url('djangosimplified', 'downloads/whitepaper.pdf', seven_days_as_seconds, s3_signature['v4'])
print(generated_signed_url)
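To check which signature version a generated URL uses, its query string can be inspected with the standard library alone. The sketch below relies on the X-Amz-* query parameters of Signature Version 4 and the Signature/Expires parameters of Version 2. The helper names and the shortened example URL are made up for illustration, and the URL is not a working link.

```python
from urllib.parse import urlsplit, parse_qs

def detect_signature_version(presigned_url):
    """Guess the signature version from the query parameters of a presigned URL."""
    params = parse_qs(urlsplit(presigned_url).query)
    if 'X-Amz-Algorithm' in params:
        return 'v4'
    if 'Signature' in params and 'Expires' in params:
        return 'v2'
    return 'unknown'

def expires_in_seconds(presigned_url):
    """Return the X-Amz-Expires value of a v4 URL as an int, or None if absent."""
    values = parse_qs(urlsplit(presigned_url).query).get('X-Amz-Expires')
    return int(values[0]) if values else None

# Hypothetical, shortened URL used only to exercise the helpers
example_url = (
    "https://example-bucket.s3.amazonaws.com/downloads/whitepaper.pdf"
    "?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Expires=604800&X-Amz-Signature=abc123"
)
print(detect_signature_version(example_url))  # v4
print(expires_in_seconds(example_url))        # 604800
```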
Creating a simple currency converter in python by using api
Jan. 25, 2021, 1:20 a.m.
Rates API is a free service for current and historical foreign exchange rates, built on top of data published by the European Central Bank. Rates API is compatible with any application and programming language.
install:
pip install requests
getting base url
base_url="https://api.exchangeratesapi.io/latest"
import requests
response=requests.get(base_url)
investigating response
response.ok
response.status_code
response.text
response.content
handling json
response.json()
type(response.json())
import json
json.dumps(response.json(),indent=4)
print(json.dumps(response.json(),indent=4))
response.json().keys()
parameters in the get request
param_url=base_url+"?symbols=USD,GBP"
param_url
response=requests.get(param_url)
response
data=response.json()
data
data['base']
data['date']
data['rates']
param_url=base_url+"?symbols=GBP"+"&"+"base=USD"
param_url
data=requests.get(param_url).json()
data
usd_to_gbp=data['rates']['GBP']
usd_to_gbp
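Instead of concatenating the query string by hand, it can be assembled with urllib.parse.urlencode from the standard library. This is a sketch under the assumption that the API accepts the same symbols and base parameters used above; the helper name build_rates_url is made up for illustration.

```python
from urllib.parse import urlencode

def build_rates_url(base_url, **params):
    """Append the given query parameters to base_url, keeping commas readable."""
    if not params:
        return base_url
    # safe="," stops the comma in "USD,GBP" from being percent-encoded
    return base_url + "?" + urlencode(params, safe=",")

base_url = "https://api.exchangeratesapi.io/latest"
print(build_rates_url(base_url, symbols="USD,GBP"))
print(build_rates_url(base_url, symbols="GBP", base="USD"))
```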
obtaining historical exchange rates
base_url="https://api.exchangeratesapi.io"
base_url
historical_url=base_url+"/2016-01-26"
historical_url
response=requests.get(historical_url)
response.status_code
data=response.json()
print(json.dumps(data,indent=4))
extracting data for a time period
time_period=base_url+'/history'+'?start_at=2017-04-26&end_at=2018-04-26'+'&symbols=GBP'
time_period
data=requests.get(time_period).json()
data
print(json.dumps(data,indent=4,sort_keys=True))
testing the api response to incorrect input
invalid_url=base_url+'/2019-13-05'
invalid_url
response=requests.get(invalid_url)
response
response=requests.get(invalid_url)
response.status_code
response.json()
creating a simple currency converter
date=input('pls enter date format YYYY-MM-DD or type latest :')
base=input('currency convert from :')
curr=input('currency convert to :')
quantity=float(input('how much do u want to convert: {}'.format(base)))
url="https://api.exchangeratesapi.io/"+date+'?base='+base+'&symbols='+curr
response=requests.get(url)
response
if not response.ok:
    print('error {}'.format(response.status_code))
    print(response.json()['error'])
else:
    data=response.json()
    rate=data['rates'][curr]
    # keep the result numeric; slicing its string form to 4 characters would truncate large amounts
    result=quantity*rate
    print('\n {} {} is equal to {:.4f} {}, based on exchage rate is {}'.format(quantity,base,result,curr,data['date']))
output:
pls enter date format YYYY-MM-DD or type latest :latest
currency convert from :USD
currency convert to :INR
how much do u want to convert: USD1
1.0 USD is equal to 73.0 INR, based on exchage rate is 2021-01-22
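The conversion step itself can be separated from the network call, which makes it easy to try out without hitting the API. In the sketch below, the sample dictionary is hand-written to mimic the base, date and rates keys explored earlier; its values are illustrative and the function name convert is an assumption.

```python
def convert(quantity, payload, currency):
    """Multiply quantity by the rate for currency found in an API-style payload."""
    rate = payload['rates'][currency]
    return quantity * rate

# Hand-written payload shaped like the API response (illustrative values)
sample = {"base": "USD", "date": "2021-01-22", "rates": {"INR": 73.0}}

result = convert(2.5, sample, "INR")
print('{} {} is equal to {:.2f} {}'.format(2.5, sample['base'], result, 'INR'))
# 2.5 USD is equal to 182.50 INR
```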
Internet speed with plotly and matplotlib in python
Jan. 16, 2021, 10:31 a.m.
install:
pip install matplotlib
pip install speedtest-cli
pip install plotly
By creating a new instance of Speedtest as s and testing the download and upload speed, we get both speeds in bits per second. The code below divides by 1048576 (2^20) to report them in Mb/s (strictly speaking this gives mebibits per second; dividing by 1,000,000 gives megabits) and records the time of each test too:
import speedtest
import datetime
import time
s = speedtest.Speedtest()
while True:
    time_now = datetime.datetime.now().strftime("%H:%M:%S")
    downspeed = round((round(s.download()) / 1048576), 2)
    upspeed = round((round(s.upload()) / 1048576), 2)
    print(f"time: {time_now}, downspeed: {downspeed} Mb/s, upspeed: {upspeed} Mb/s")
    # 60 seconds sleep
    time.sleep(60)
output:
time: 12:44:15, downspeed: 95.04 Mb/s, upspeed: 32.85 Mb/s
time: 12:44:35, downspeed: 99.46 Mb/s, upspeed: 38.76 Mb/s
time: 12:44:56, downspeed: 100.59 Mb/s, upspeed: 38.94 Mb/s
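The unit conversion can be isolated in small helper functions, which also makes the difference between the two possible divisors visible. This is a sketch with made-up helper names and a made-up sample reading, not part of the speedtest library.

```python
def bits_to_megabits(bits_per_second):
    """Convert bits/s to megabits/s (1 Mb = 1,000,000 bits)."""
    return round(bits_per_second / 1_000_000, 2)

def bits_to_mebibits(bits_per_second):
    """Convert bits/s to mebibits/s (1 Mib = 1,048,576 bits), as the loop above does."""
    return round(bits_per_second / 1_048_576, 2)

sample_reading = 100_000_000  # hypothetical reading in bits per second
print(bits_to_megabits(sample_reading))  # 100.0
print(bits_to_mebibits(sample_reading))  # 95.37
```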
Now we will move on to recording this in a CSV file. CSV files are plain-text files in which values are separated by commas.
To write to a CSV file in Python we need to import the csv module and open a CSV file (if one doesn’t exist, it will be created).
mynet_speed.py
import speedtest
import datetime
import csv
import time
s = speedtest.Speedtest()
with open('test.csv', mode='w', newline='') as speedcsv:
    csv_writer = csv.DictWriter(speedcsv, fieldnames=['time', 'downspeed', 'upspeed'])
    csv_writer.writeheader()
    while True:
        time_now = datetime.datetime.now().strftime("%H:%M:%S")
        downspeed = round((round(s.download()) / 1048576), 2)
        upspeed = round((round(s.upload()) / 1048576), 2)
        csv_writer.writerow({
            'time': time_now,
            'downspeed': downspeed,
            "upspeed": upspeed
        })
        speedcsv.flush()  # write the row to disk right away so the file can be read while the loop runs
        # 60 seconds sleep
        time.sleep(60)
While this code runs for 4-5 minutes, we can discuss what is going on. The with open statement creates a CSV file named test.csv and writes the headers time, downspeed and upspeed into it. Then the loop begins, and every time speedtest performs a test, a new row is written to the CSV with the time, download speed and upload speed. So let’s go and look at that now.
time,downspeed,upspeed
12:51:16,99.29,38.66
12:51:37,100.67,38.79
12:51:57,99.7,38.79
12:52:17,92.89,31.99
12:52:38,99.4,38.96
Let’s make another python file to generate the graph of our internet connection. This is where we will use matplotlib.
my_net_graph.py
import matplotlib.pyplot as plt
import csv
import matplotlib.ticker as ticker
times = []
download = []
upload = []
with open('test.csv', 'r') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    next(csvfile)
    res = [ele for ele in plots if ele != []]
    for row in res:
        times.append(str(row[0]))
        download.append(float(row[1]))
        upload.append(float(row[2]))
print(times, "\n", download, "\n", upload)
output:
['12:51:16', '12:51:37', '12:51:57', '12:52:17', '12:52:38']
[99.29, 100.67, 99.7, 92.89, 99.4]
[38.66, 38.79, 38.79, 31.99, 38.96]
So now we are parsing our data! The next(csvfile) call essentially skips the row of headers (that were for our benefit only, not python’s). Now we come on to using matplotlib, which I am by no standards an expert on. Their documentation is extensive.
plt.figure(30)
plt.plot(times, download, label='download', color='r')
plt.plot(times, upload, label='upload', color='b')
plt.xlabel('time')
plt.ylabel('speed in Mb/s')
plt.title("internet speed")
plt.legend()
plt.savefig('test_graph.jpg', bbox_inches='tight')
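As an alternative to skipping the header with next(csvfile) in the parsing step, csv.DictReader reads the header row itself and lets each column be addressed by name. The sketch below feeds it an in-memory copy of the first rows of test.csv through io.StringIO, so it runs without the file on disk.

```python
import csv
import io

# In-memory copy of the first rows of test.csv shown earlier
sample_csv = io.StringIO(
    "time,downspeed,upspeed\n"
    "12:51:16,99.29,38.66\n"
    "12:51:37,100.67,38.79\n"
)

times, download, upload = [], [], []
# DictReader consumes the header row and yields one dict per data row
for row in csv.DictReader(sample_csv):
    times.append(row['time'])
    download.append(float(row['downspeed']))
    upload.append(float(row['upspeed']))

print(times, download, upload)
```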
For matplotlib (complete script):
import matplotlib.pyplot as plt
import csv
import matplotlib.ticker as ticker
times = []
download = []
upload = []
with open('test.csv', 'r') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    next(csvfile)
    res = [ele for ele in plots if ele != []]
    for row in res:
        times.append(str(row[0]))
        download.append(float(row[1]))
        upload.append(float(row[2]))
print(times, "\n", download, "\n", upload)
plt.figure(30)
plt.plot(times, download, label='download', color='r')
plt.plot(times, upload, label='upload', color='b')
plt.xlabel('time')
plt.ylabel('speed in Mb/s')
plt.title("internet speed")
plt.legend()
plt.savefig('test_graph.jpg', bbox_inches='tight')
For plotly:
import plotly
import plotly.graph_objs as go
import csv
times = []
download = []
upload = []
with open('test.csv', 'r') as csvfile:
    plots = csv.reader(csvfile, delimiter=',')
    next(csvfile)
    res = [ele for ele in plots if ele != []]
    for row in res:
        times.append(str(row[0]))
        download.append(float(row[1]))
        upload.append(float(row[2]))
print(times, "\n", download, "\n", upload)
# Create traces
trace0 = go.Scatter(
x = times,
y = download,
mode = 'lines+markers',
name = 'Download'
)
trace1 = go.Scatter(
x = times,
y = upload,
mode = 'lines+markers',
name = 'Upload'
)
data = [trace0, trace1]
plotly.offline.plot(data, filename='scatter-mode')
How to extract tables from image in python
Jan. 14, 2021, 9:27 p.m.
install:
pip install opencv-python
pip install pytesseract
pip install openpyxl
Pytesseract : “TesseractNotFound Error: tesseract is not installed or it's not in your path”, how do I fix this?
for Windows:
1. Install tesseract using the Windows installer available at: https://github.com/UB-Mannheim/tesseract/wiki
2. Note the tesseract path from the installation. The default installation path at the time of this edit was: C:\Users\USER\AppData\Local\Tesseract-OCR. It may change, so please check the installation path.
3. pip install pytesseract
4. Set the tesseract path in the script before calling image_to_string:
pytesseract.pytesseract.tesseract_cmd = r'C:\Users\USER\AppData\Local\Tesseract-OCR\tesseract.exe'
If you are using Ubuntu install tesseract using following command:
sudo apt-get install tesseract-ocr
For mac:
brew install tesseract
On Linux
sudo apt-get update
sudo apt-get install libleptonica-dev
sudo apt-get install tesseract-ocr tesseract-ocr-dev
sudo apt-get install libtesseract-dev
Then install the Python package using pip:
pip install pytesseract
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import csv
try:
    from PIL import Image
except ImportError:
    import Image
import pytesseract
The first step is to read in your file from the proper path, using thresholding to convert the input image to a binary image and inverting it to get a black background and white lines and fonts.
#read your file
file=r'/Users/YOURPATH/testcv.png'
img = cv2.imread(file,0)
img.shape
#thresholding the image to a binary image
thresh,img_bin = cv2.threshold(img,128,255,cv2.THRESH_BINARY |cv2.THRESH_OTSU)
#inverting the image
img_bin = 255-img_bin
cv2.imwrite('/Users/YOURPATH/cv_inverted.png',img_bin)
#Plotting the image to see the output
plotting = plt.imshow(img_bin,cmap='gray')
plt.show()
The next step is to define kernels for detecting rectangular boxes and, from them, the tabular structure. First we define the kernel length, and then the vertical and horizontal kernels used later to detect all vertical and all horizontal lines.
# Length(width) of kernel as 100th of total width
kernel_len = np.array(img).shape[1]//100
# Defining a vertical kernel to detect all vertical lines of image
ver_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_len))
# Defining a horizontal kernel to detect all horizontal lines of image
hor_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_len, 1))
# A kernel of 2x2
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
The next step is the detection of the vertical lines.
#Use vertical kernel to detect and save the vertical lines in a jpg
image_1 = cv2.erode(img_bin, ver_kernel, iterations=3)
vertical_lines = cv2.dilate(image_1, ver_kernel, iterations=3)
cv2.imwrite("/Users/YOURPATH/vertical.jpg",vertical_lines)
#Plot the generated image
plotting = plt.imshow(image_1,cmap='gray')
plt.show()
And now the same for all horizontal lines.
#Use horizontal kernel to detect and save the horizontal lines in a jpg
image_2 = cv2.erode(img_bin, hor_kernel, iterations=3)
horizontal_lines = cv2.dilate(image_2, hor_kernel, iterations=3)
cv2.imwrite("/Users/YOURPATH/horizontal.jpg",horizontal_lines)
#Plot the generated image
plotting = plt.imshow(image_2,cmap='gray')
plt.show()
We combine the horizontal and vertical lines to a third image, by weighting both with 0.5. The aim is to get a clear tabular structure to detect each cell.
# Combine horizontal and vertical lines in a new third image, with both having same weight.
img_vh = cv2.addWeighted(vertical_lines, 0.5, horizontal_lines, 0.5, 0.0)
#Eroding and thresholding the image
img_vh = cv2.erode(~img_vh, kernel, iterations=2)
thresh, img_vh = cv2.threshold(img_vh,128,255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imwrite("/Users/YOURPATH/img_vh.jpg", img_vh)
bitxor = cv2.bitwise_xor(img,img_vh)
bitnot = cv2.bitwise_not(bitxor)
#Plotting the generated image
plotting = plt.imshow(bitnot,cmap='gray')
plt.show()
After having the tabular structure we use the findContours function to detect the contours. This helps us to retrieve the exact coordinates of each box.
# Detect contours for following box detection
contours, hierarchy = cv2.findContours(img_vh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
The following function is necessary to get a sequence of the contours and to sort them from top-to-bottom (https://www.pyimagesearch.com/2015/04/20/sorting-contours-using-python-and-opencv/).
def sort_contours(cnts, method="left-to-right"):
    # initialize the reverse flag and sort index
    reverse = False
    i = 0
    # handle if we need to sort in reverse
    if method == "right-to-left" or method == "bottom-to-top":
        reverse = True
    # handle if we are sorting against the y-coordinate rather than
    # the x-coordinate of the bounding box
    if method == "top-to-bottom" or method == "bottom-to-top":
        i = 1
    # construct the list of bounding boxes and sort them from top to
    # bottom
    boundingBoxes = [cv2.boundingRect(c) for c in cnts]
    (cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes),
                                        key=lambda b: b[1][i], reverse=reverse))
    # return the list of sorted contours and bounding boxes
    return (cnts, boundingBoxes)
# Sort all the contours by top to bottom.
contours, boundingBoxes = sort_contours(contours, method="top-to-bottom")
How to retrieve the cells position
The further steps are necessary to define the right location, which means proper column and row, of each cell. First, we need to retrieve the height for each cell and store it in the list heights. Then we take the mean from the heights.
#Creating a list of heights for all detected boxes
heights = [boundingBoxes[i][3] for i in range(len(boundingBoxes))]
#Get mean of heights
mean = np.mean(heights)
Next we retrieve the position, width and height of each contour and store them in the box list. Then we draw rectangles around all our boxes and plot the image. In my case I only did this for boxes smaller than 1000 px wide and 500 px high, to exclude rectangles that are probably not cells, e.g. the table as a whole. These two values depend on your image size, so if your image is a lot smaller or bigger you need to adjust both.
#Create list box to store all boxes in
box = []
# Get position (x,y), width and height for every contour and show the contour on image
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    if (w<1000 and h<500):
        image = cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
        box.append([x,y,w,h])
plotting = plt.imshow(image,cmap='gray')
plt.show()
Now that we have every cell with its location, height and width, we need to find its position within the table, that is, the row and the column in which it is located. As long as the y-coordinate of a box does not exceed the y-coordinate of the previous box by more than half the mean height (mean/2), the box is in the same row. As soon as the difference is larger than mean/2, we know that a new row starts. Columns are logically arranged from left to right.
#Creating two lists to define row and column in which cell is located
row=[]
column=[]
j=0
#Sorting the boxes to their respective row and column
for i in range(len(box)):
    if(i==0):
        column.append(box[i])
        previous=box[i]
    else:
        if(box[i][1]<=previous[1]+mean/2):
            column.append(box[i])
            previous=box[i]
            if(i==len(box)-1):
                row.append(column)
        else:
            row.append(column)
            column=[]
            previous = box[i]
            column.append(box[i])
print(column)
print(row)
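The row-grouping rule described above can be tried on a handful of hand-made boxes without OpenCV. The following is only a minimal sketch: the function name group_into_rows and the sample boxes are made up for illustration, and unlike the loop above it always appends the last row, even when that row contains a single box.

```python
def group_into_rows(boxes, tolerance):
    """Group (x, y, w, h) boxes, already sorted top to bottom, into rows."""
    rows = []
    current = []
    previous_y = None
    for box in boxes:
        y = box[1]
        # Same row while y stays within `tolerance` of the previous box
        if previous_y is None or y <= previous_y + tolerance:
            current.append(box)
        else:
            rows.append(current)
            current = [box]
        previous_y = y
    if current:
        rows.append(current)  # flush the last row
    return rows


# Hypothetical boxes: two rows of two cells each
boxes = [[10, 5, 40, 20], [60, 7, 40, 20], [10, 30, 40, 20], [60, 31, 40, 20]]
mean_height = sum(b[3] for b in boxes) / len(boxes)
rows = group_into_rows(boxes, mean_height / 2)
print(rows)
```

With a mean height of 20 the tolerance is 10, so the boxes at y=5 and y=7 share a row and the boxes at y=30 and y=31 form the next one.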
Next we calculate the maximum number of columns (meaning cells) to understand how many columns our final dataframe/table will have.
#calculating maximum number of cells
countcol = 0
for i in range(len(row)):
    current = len(row[i])
    # keep the largest row length seen so far
    if current > countcol:
        countcol = current
After having the maximum number of cells we store the midpoint of each column in a list, create an array and sort the values.
#Retrieving the center of each column
center = [int(row[i][j][0]+row[i][j][2]/2) for j in range(len(row[i])) if row[0]]
center=np.array(center)
center.sort()
At this point, we have all boxes and their values, but as you might see in the output of your row list, the values are not always sorted in the right order. We fix that next by assigning each box to the column whose center is closest to it. We store the properly ordered boxes in the list finalboxes.
#Regarding the distance to the columns center, the boxes are arranged in respective order
finalboxes = []
for i in range(len(row)):
    lis=[]
    for k in range(countcol):
        lis.append([])
    for j in range(len(row[i])):
        diff = abs(center-(row[i][j][0]+row[i][j][2]/4))
        minimum = min(diff)
        indexing = list(diff).index(minimum)
        lis[indexing].append(row[i][j])
    finalboxes.append(lis)
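To see what the nearest-center assignment does, here is a small pure-Python sketch on invented data. The function name assign_to_columns, the centers and the boxes are assumptions for illustration only; the sketch also measures the distance from each box's midpoint (x + w/2), whereas the code above uses x + w/4.

```python
def assign_to_columns(row_boxes, centers):
    """Place each (x, y, w, h) box in the slot of the nearest column center."""
    slots = [[] for _ in centers]
    for box in row_boxes:
        x, _, w, _ = box
        mid = x + w / 2  # horizontal midpoint of the box
        nearest = min(range(len(centers)), key=lambda i: abs(centers[i] - mid))
        slots[nearest].append(box)
    return slots


# Hypothetical row: three columns expected, middle cell missing, order scrambled
centers = [30, 80, 130]
row_boxes = [[110, 5, 40, 20], [10, 5, 40, 20]]
slots = assign_to_columns(row_boxes, centers)
print(slots)
```

The empty middle slot is what later becomes a blank cell in the dataframe.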
Let’s extract the values
In the next step we make use of our list finalboxes. We take every image-based box, prepare it for Optical Character Recognition by dilating and eroding it and let pytesseract recognize the containing strings. The loop runs over every cell and stores the value in the outer list.
#from every single image-based cell/box the strings are extracted via pytesseract and stored in a list
outer=[]
for i in range(len(finalboxes)):
    for j in range(len(finalboxes[i])):
        inner=''
        if(len(finalboxes[i][j])==0):
            outer.append(' ')
        else:
            for k in range(len(finalboxes[i][j])):
                y,x,w,h = finalboxes[i][j][k][0],finalboxes[i][j][k][1], finalboxes[i][j][k][2],finalboxes[i][j][k][3]
                finalimg = bitnot[x:x+h, y:y+w]
                kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 1))
                border = cv2.copyMakeBorder(finalimg,2,2,2,2, cv2.BORDER_CONSTANT,value=[255,255])
                resizing = cv2.resize(border, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
                dilation = cv2.dilate(resizing, kernel,iterations=1)
                erosion = cv2.erode(dilation, kernel,iterations=1)
                pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe' #for windows only
                out = pytesseract.image_to_string(erosion)
                if(len(out)==0):
                    out = pytesseract.image_to_string(erosion, config='--psm 3')
                inner = inner +" "+ out
            outer.append(inner)
The last step is the conversion of the list to a dataframe and storing it into an excel-file.
#Creating a dataframe of the generated OCR list
arr = np.array(outer)
dataframe = pd.DataFrame(arr.reshape(len(row),countcol))
print(dataframe)
data = dataframe.style.set_properties(align="left")
#Converting it to an Excel file
data.to_excel("/Users/YOURPATH/output.xlsx")
That's it.
Final code after combining all the steps:
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import csv
try:
from PIL import Image
except ImportError:
import Image
import pytesseract
#read your file
file=r'/Users/kwikl3arn/Desktop/roseflower.png'
img = cv2.imread(file,0)
img.shape
#thresholding the image to a binary image
thresh,img_bin = cv2.threshold(img,128,255,cv2.THRESH_BINARY | cv2.THRESH_OTSU)
#inverting the image
img_bin = 255-img_bin
cv2.imwrite('/Users/kwikl3arn/Desktop/cv_inverted.png',img_bin)
#Plotting the image to see the output
plotting = plt.imshow(img_bin,cmap='gray')
plt.show()
# Length(width) of kernel as 100th of total width
kernel_len = np.array(img).shape[1]//100
# Defining a vertical kernel to detect all vertical lines of image
ver_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, kernel_len))
# Defining a horizontal kernel to detect all horizontal lines of image
hor_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_len, 1))
# A kernel of 2x2
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
#Use vertical kernel to detect and save the vertical lines in a jpg
image_1 = cv2.erode(img_bin, ver_kernel, iterations=3)
vertical_lines = cv2.dilate(image_1, ver_kernel, iterations=3)
cv2.imwrite("/Users/kwikl3arn/Desktop/vertical.jpg",vertical_lines)
#Plot the generated image
plotting = plt.imshow(image_1,cmap='gray')
plt.show()
#Use horizontal kernel to detect and save the horizontal lines in a jpg
image_2 = cv2.erode(img_bin, hor_kernel, iterations=3)
horizontal_lines = cv2.dilate(image_2, hor_kernel, iterations=3)
cv2.imwrite("/Users/kwikl3arn/Desktop/horizontal.jpg",horizontal_lines)
#Plot the generated image
plotting = plt.imshow(image_2,cmap='gray')
plt.show()
# Combine horizontal and vertical lines in a new third image, with both having same weight.
img_vh = cv2.addWeighted(vertical_lines, 0.5, horizontal_lines, 0.5, 0.0)
#Eroding and thresholding the image
img_vh = cv2.erode(~img_vh, kernel, iterations=2)
thresh, img_vh = cv2.threshold(img_vh,128,255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
cv2.imwrite("/Users/kwikl3arn/Desktop/img_vh.jpg", img_vh)
bitxor = cv2.bitwise_xor(img,img_vh)
bitnot = cv2.bitwise_not(bitxor)
#Plotting the generated image
plotting = plt.imshow(bitnot,cmap='gray')
plt.show()
# Detect contours for following box detection
contours, hierarchy = cv2.findContours(img_vh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
def sort_contours(cnts, method="left-to-right"):
    # initialize the reverse flag and sort index
    reverse = False
    i = 0
    # handle if we need to sort in reverse
    if method == "right-to-left" or method == "bottom-to-top":
        reverse = True
    # handle if we are sorting against the y-coordinate rather than
    # the x-coordinate of the bounding box
    if method == "top-to-bottom" or method == "bottom-to-top":
        i = 1
    # construct the list of bounding boxes and sort them from top to
    # bottom
    boundingBoxes = [cv2.boundingRect(c) for c in cnts]
    (cnts, boundingBoxes) = zip(*sorted(zip(cnts, boundingBoxes),
                                        key=lambda b: b[1][i], reverse=reverse))
    # return the list of sorted contours and bounding boxes
    return (cnts, boundingBoxes)
# Sort all the contours by top to bottom.
contours, boundingBoxes = sort_contours(contours, method="top-to-bottom")
#Creating a list of heights for all detected boxes
heights = [boundingBoxes[i][3] for i in range(len(boundingBoxes))]
#Get mean of heights
mean = np.mean(heights)
#Create list box to store all boxes in
box = []
# Get position (x,y), width and height for every contour and show the contour on image
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    if (w<1000 and h<500):
        image = cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
        box.append([x,y,w,h])
plotting = plt.imshow(image,cmap='gray')
plt.show()
#Creating two lists to define row and column in which cell is located
row=[]
column=[]
j=0
#Sorting the boxes to their respective row and column
for i in range(len(box)):
    if(i==0):
        column.append(box[i])
        previous=box[i]
    else:
        if(box[i][1]<=previous[1]+mean/2):
            column.append(box[i])
            previous=box[i]
            if(i==len(box)-1):
                row.append(column)
        else:
            row.append(column)
            column=[]
            previous = box[i]
            column.append(box[i])
print(column)
print(row)
#calculating maximum number of cells
countcol = 0
for i in range(len(row)):
    current = len(row[i])
    # keep the largest row length seen so far
    if current > countcol:
        countcol = current
#Retrieving the center of each column
center = [int(row[i][j][0]+row[i][j][2]/2) for j in range(len(row[i])) if row[0]]
center=np.array(center)
center.sort()
print(center)
#Regarding the distance to the columns center, the boxes are arranged in respective order
finalboxes = []
for i in range(len(row)):
    lis=[]
    for k in range(countcol):
        lis.append([])
    for j in range(len(row[i])):
        diff = abs(center-(row[i][j][0]+row[i][j][2]/4))
        minimum = min(diff)
        indexing = list(diff).index(minimum)
        lis[indexing].append(row[i][j])
    finalboxes.append(lis)
#from every single image-based cell/box the strings are extracted via pytesseract and stored in a list
outer=[]
for i in range(len(finalboxes)):
    for j in range(len(finalboxes[i])):
        inner=''
        if(len(finalboxes[i][j])==0):
            outer.append(' ')
        else:
            for k in range(len(finalboxes[i][j])):
                y,x,w,h = finalboxes[i][j][k][0],finalboxes[i][j][k][1], finalboxes[i][j][k][2],finalboxes[i][j][k][3]
                finalimg = bitnot[x:x+h, y:y+w]
                kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 1))
                border = cv2.copyMakeBorder(finalimg,2,2,2,2, cv2.BORDER_CONSTANT,value=[255,255])
                resizing = cv2.resize(border, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
                dilation = cv2.dilate(resizing, kernel,iterations=1)
                erosion = cv2.erode(dilation, kernel,iterations=2)
                pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe' #for windows only
                out = pytesseract.image_to_string(erosion)
                if(len(out)==0):
                    out = pytesseract.image_to_string(erosion, config='--psm 3')
                inner = inner +" "+ out
            outer.append(inner)
#Creating a dataframe of the generated OCR list
arr = np.array(outer)
dataframe = pd.DataFrame(arr.reshape(len(row), countcol))
print(dataframe)
data = dataframe.style.set_properties(align="left")
#Converting it to an Excel file
data.to_excel("/Users/kwikl3arn/Desktop/output.xlsx")
Counting repeated characters in a string in python
Nov. 7, 2020, 1:45 a.m.
check_string = "i am checking this string to see how many times each character appears"
count = {}
for s in check_string:
    if s in count:
        count[s] += 1
    else:
        count[s] = 1
for key in count:
    if count[key] > 1:
        print(key+'=', count[key])
Output:
i= 5
= 12
a= 7
m= 3
c= 5
h= 5
e= 7
n= 3
g= 2
t= 5
s= 5
r= 4
o= 2
p= 2
Another method:
from collections import Counter
string = "ihavesometextbutidontmindsharing"
print(Counter(string))
Output (the order of entries with equal counts may differ between Python versions):
Counter({'i': 4, 't': 4, 'e': 3, 'n': 3, 's': 2, 'h': 2, 'm': 2, 'o': 2, 'a': 2, 'd': 2, 'x': 1, 'r': 1, 'u': 1, 'b': 1, 'v': 1, 'g': 1})
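The two methods can be combined: Counter does the counting, and a filter keeps only the characters that repeat, as in the first method. This is a small sketch; the helper name repeated_characters and the sample text are chosen here for illustration.

```python
from collections import Counter


def repeated_characters(text):
    """Return {character: count} for characters that occur more than once."""
    counts = Counter(text)
    # most_common() lists entries from the highest count to the lowest
    return {ch: n for ch, n in counts.most_common() if n > 1}


result = repeated_characters("hello world")
print(result)
```

For "hello world" only 'l' (3 times) and 'o' (2 times) are repeated, so only those two entries remain.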
Cron job using python
Aug. 27, 2020, 6:08 p.m.
Cron is a system daemon that executes tasks in the background at designated times. A crontab is a simple text file listing commands to be run at specified times. The cron daemon controls these commands and their run times and executes them in the system background. Each user has a crontab file that specifies actions and the times at which they should be executed; these jobs run regardless of whether the user is actually logged into the system. There is also a root crontab for tasks requiring administrative privileges. This system crontab allows scheduling of system-wide tasks such as log rotation and system database updates.
Often we want to manage cron programmatically, for example to supply a command and schedule a cron job without editing the crontab file by hand. The Python library python-crontab provides a simple and effective way to access a crontab from Python, letting the programmer load cron jobs as objects, search for them, and save changes.
Installation
The package can be installed directly using pip. Make sure you install python-crontab and not crontab, which is a different package on PyPI.
pip install python-crontab
Crontab Syntax
Cron uses a specific syntax to define the time schedules. It consists of five fields, which are separated by white spaces. The fields are:
Minute Hour Day Month Day_of_the_Week
The fields can have the following values:
┌───────────── minute (0 - 59)
│ ┌───────────── hour (0 - 23)
│ │ ┌───────────── day of month (1 - 31)
│ │ │ ┌───────────── month (1 - 12)
│ │ │ │ ┌───────────── day of week (0 - 6) (Sunday to Saturday;
│ │ │ │ │ 7 is also Sunday on some systems)
│ │ │ │ │
│ │ │ │ │
* * * * * command to execute
Source: Wikipedia. Cron. Available at https://en.wikipedia.org/wiki/Cron
Cron also accepts special characters so you can create more complex time schedules. The special characters have the following meanings:
Character | Meaning |
---|---|
Comma | To separate multiple values |
Hyphen | To indicate a range of values |
Asterisk | To indicate all possible values |
Forward slash | To indicate a step, i.e. every n-th value (for example */5) |
Let's see some examples:
* * * * *
means: every minute of every hour of every day of the month for every month for every day of the week.
0 16 1,10,22 * *
tells cron to run a task at 4 PM (which is the 16th hour) on the 1st, 10th and 22nd day of every month.
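To make the special characters concrete, the sketch below expands a single cron field into the list of values it matches. It is a simplified illustration, not how cron itself is implemented: the function name expand_field is made up, only numeric values are handled (no names such as SUN or JAN), and no range validation is done.

```python
def expand_field(field, low, high):
    """Expand one cron field into the sorted list of values it matches."""
    values = set()
    for part in field.split(","):          # comma: several values
        step = 1
        if "/" in part:                    # slash: step through the range
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":                    # asterisk: all possible values
            start, end = low, high
        elif "-" in part:                  # hyphen: a range of values
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:                              # a single value
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return sorted(values)


print(expand_field("1,10,22", 1, 31))   # day-of-month field from the example above
print(expand_field("*/15", 0, 59))      # every 15 minutes
print(expand_field("9-17", 0, 23))      # hours 9 through 17
```

The second and third arguments are the lowest and highest value the field allows, as listed in the diagram above (0 to 59 for minutes, 1 to 31 for the day of the month, and so on).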
Getting Access to Crontab
According to the crontab help page, there are five ways to include a job in cron. Of them, three work on Linux only, and two can also be used on Windows.
The first way to access cron is by using the username. The syntax is as follows:
cron = CronTab(user='username')
The other two Linux ways are:
cron = CronTab()
# or
cron = CronTab(user=True)
There are two more syntaxes that will also work on Windows.
In the first one, we call a task defined in the file "filename.tab":
cron = CronTab(tabfile='filename.tab')
In the second one, we define the task according to cron's syntax:
cron = CronTab(tab="""* * * * * command""")
How can I get the current user's username in Bash?
for linux
On the command line, enter
whoami
or
echo "$USER"
which command in Linux
The which command in Linux locates the executable file associated with a given command by searching the directories listed in the PATH environment variable.
For example, to find the full path of the ping command , you would type the following:
which ping
The output will be something like this:
/bin/ping
How to print the current working directory
To print the current working directory, run the pwd command. The full path of the current working directory will be printed to standard output.
pwd
/home/dilip
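One plausible reason these commands matter for cron jobs (an assumption, as the text does not state it) is that cron usually runs commands with a minimal environment, so absolute paths are the safer choice in a crontab entry. The sketch below builds such a command from Python using only the standard library; the helper name absolute_command is hypothetical.

```python
import os
import shutil
import sys


def absolute_command(script, interpreter_name="python3"):
    """Build a command string that uses absolute paths only."""
    # shutil.which() mirrors the shell's `which`; it returns None when the
    # program is not on PATH, so we fall back to the running interpreter.
    interpreter = shutil.which(interpreter_name) or sys.executable
    # os.path.abspath() resolves the script against the current working
    # directory, i.e. the directory that `pwd` would print.
    return "{} {}".format(interpreter, os.path.abspath(script))


command = absolute_command("example1.py")
print(command)
```

The resulting string could then be passed as the command argument when creating a job.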
Creating a New Job
Once we have accessed cron, we can create a new task by using the following command:
cron.new(command='my command')
Here, my command defines the task to be executed via the command line.
We can also add a comment to our task. The syntax is as follows:
cron.new(command='my command', comment='my comment')
Let's see this in an example:
from crontab import CronTab
cron = CronTab(user='username')
job = cron.new(command='python example1.py')
job.minute.every(1)
cron.write()
In the above code we have first accessed cron via the username, and then created a job that consists of running a Python script named example1.py. In addition, we have set the task to be run every 1 minute. The write() function adds our job to cron.
Setting Restrictions
One of the main advantages of using Python's crontab module is that we can set up time restrictions without having to use cron's syntax.
In the example above, we have already seen how to set running the job every minute. The syntax is as follows:
job.minute.every(minutes)
Similarly we could set up the hours:
job.hour.every(hours)
We can also set up the task to be run on certain days of the week. For example:
job.dow.on('SUN')
The above code will tell cron to run the task on Sundays, and the following code will tell cron to schedule the task on Sundays and Fridays:
job.dow.on('SUN', 'FRI')
Similarly, we can tell cron to run the task in specific months. For example:
job.month.during('APR', 'NOV')
This will tell cron to run the program during the months from April through November.
An important thing to consider is that each time we set a time restriction, we nullify the previous one. Thus, for example:
job.hour.every(5)
job.hour.every(7)
The above code will set the final schedule to run every seven hours, cancelling the previous schedule of five hours.
Unless we append a schedule to the previous one, like this:
job.hour.every(15)
job.hour.also.on(3)
This will set the schedule as every 15 hours, and at 3 AM.
The 'every' condition can be a bit confusing at times. If we write job.hour.every(15), this will be equivalent to * */15 * * *. As we can see, the minutes have not been modified.
If we want to set the minutes field to zero, we can use the following syntax:
job.every(15).hours()
This will set the schedule to 0 */15 * * *. The same applies to the 'day of the month', 'month' and 'day of the week' fields.
Examples:
job.every(2).month is equivalent to 0 0 0 */2 * and job.month.every(2) is equivalent to * * * */2 *
job.every(2).dows is equivalent to 0 0 * * */2 and job.dows.every(2) is equivalent to * * * * */2
We can see the differences in the following example:
from crontab import CronTab
cron = CronTab(user='username')
job1 = cron.new(command='python example1.py')
job1.hour.every(2)
job2 = cron.new(command='python example1.py')
job2.every(2).hours()
for item in cron:
    print(item)
cron.write()
After running the program, the result is as follows:
$ python cron2.py
* */2 * * * python /home/eca/cron/example1.py
0 */2 * * * python /home/eca/cron/example1.py
The program has set the second task's minutes to zero and left the first task's minutes at their default value.
Finally, we can set the task to be run every time we boot our machine. The syntax is as follows:
job.every_reboot()
Clearing Restrictions
We can clear all task's restrictions with the following command:
job.clear()
The following code shows how to use the above command:
from crontab import CronTab
cron = CronTab(user='username')
job = cron.new(command='python example1.py', comment='comment')
job.minute.every(5)
for item in cron:
    print(item)
job.clear()
for item in cron:
    print(item)
cron.write()
After running the code we get the following result:
$ python cron3.py
*/5 * * * * python /home/eca/cron/example1.py # comment
* * * * * python /home/eca/cron/example1.py # comment
The schedule has changed from every 5 minutes to the default setting.
Enabling and Disabling a Job
A task can be enabled or disabled using the following commands:
To enable a job:
job.enable()
To disable a job:
job.enable(False)
In order to verify whether a task is enabled or disabled, we can use the following command:
job.is_enabled()
The following example shows how to enable and disable a previously created job, and verify both states:
from crontab import CronTab
cron = CronTab(user='username')
job = cron.new(command='python example1.py', comment='comment')
job.minute.every(1)
cron.write()
print(job.enable())
print(job.enable(False))
The result is as follows:
$ python cron4.py
True
False
Checking Validity
We can easily check whether a task is valid or not with the following command:
job.is_valid()
The following example shows how to use this command:
from crontab import CronTab
cron = CronTab(user='username')
job = cron.new(command='python example1.py', comment='comment')
job.minute.every(1)
cron.write()
print(job.is_valid())
After running the above program, we obtain the validation result:
$ python cron5.py
True
Listing All Cron Jobs
All cron jobs, including disabled jobs can be listed with the following code:
for job in cron:
    print(job)
Adding those lines of code to our first example will show our task by printing on the screen the following:
$ python cron6.py
* * * * * python /home/eca/cron/example1.py
Finding a Job
The Python crontab module also allows us to search for tasks based on a selection criterion, which can be based on a command, a comment, or a scheduled time. The syntaxes are different for each case.
Find according to command:
cron.find_command("command name")
Here 'command name' can be a sub-match or a regular expression.
Find according to comment:
cron.find_comment("comment")
Find according to time:
cron.find_time(time schedule)
The following example shows how to find a previously defined task, according to the three criteria previously mentioned:
from crontab import CronTab
cron = CronTab(user='username')
job = cron.new(command='python example1.py', comment='comment')
job.minute.every(1)
cron.write()
iter1 = cron.find_command('exam')
iter2 = cron.find_comment('comment')
iter3 = cron.find_time("*/1 * * * *")
for item1 in iter1:
    print(item1)
for item2 in iter2:
    print(item2)
for item3 in iter3:
    print(item3)
The result is the listing of the same job three times:
$ python cron7.py
* * * * * python /home/eca/cron/example1.py # comment
* * * * * python /home/eca/cron/example1.py # comment
* * * * * python /home/eca/cron/example1.py # comment
As you can see, it correctly finds the cron command each time.
Removing Jobs
Each job can be removed separately. The syntax is as follows:
cron.remove(job)
The following code shows how to remove a task that was previously created. The program first creates the task. Then, it lists all tasks, showing the one just created. After this, it removes the task, and shows the resulting empty list.
from crontab import CronTab
cron = CronTab(user='username')
job = cron.new(command='python example1.py')
job.minute.every(1)
cron.write()
print("Job created")
# list all cron jobs (including disabled ones)
for job in cron:
    print(job)
cron.remove(job)
print("Job removed")
# list all cron jobs (including disabled ones)
for job in cron:
    print(job)
The result is as follows:
$ python cron8.py
Job created
* * * * * python /home/eca/cron/example1.py
Job removed
Jobs can also be removed based on a condition. For example:
cron.remove_all(comment='my comment')
This will remove all jobs where comment='my comment'.
Clearing All Jobs
All cron jobs can be removed at once by using the following command:
cron.remove_all()
The following example will remove all cron jobs and show an empty list.
from crontab import CronTab
cron = CronTab(user='username')
cron.remove_all()
# list all cron jobs (including disabled ones)
for job in cron:
    print(job)
Environmental Variables
We can also define environmental variables specific to our scheduled task and show them on the screen. The variables are saved in a dictionary. The syntax to define a new environmental variable is as follows:
job.env['VARIABLE_NAME'] = 'Value'
If we want to get the values for all the environmental variables, we can use the following syntax:
job.env
The example below defines two new environmental variables for the task and shows their values on the screen. The code is as follows:
from crontab import CronTab
cron = CronTab(user='username')
job = cron.new(command='python example1.py')
job.minute.every(1)
job.env['MY_ENV1'] = 'A'
job.env['MY_ENV2'] = 'B'
cron.write()
print(job.env)
After running the above program, we get the following result:
$ python cron9.py
MY_ENV1=A
MY_ENV2=B
In addition, Cron-level environment variables are stored in 'cron.env'.
Log Functionality
The log functionality reads a cron log backwards to find the last run instances of your crontab and cron jobs.
The crontab limits the returned entries to the user the crontab is for.
cron = CronTab(user='root')
for d in cron.log:
    print("{} - {}".format(d['pid'], d['date']))
Each job can return a log iterator too; these are filtered so you can see when the last execution was.
# find_command() yields the matching jobs; take the first one
first_echo_job = next(iter(cron.find_command('echo')))
for d in first_echo_job.log:
    print("{} - {}".format(d['pid'], d['date']))
Schedule Functionality
If you have croniter python module installed, you will have access to a schedule on each job. For example if you want to know when a job will next run:
schedule = job.schedule(date_from=datetime.now())
This creates a schedule croniter based on the job from time specified. The default date_from is current date/time if not specified. Next we can get datetime of the next job:
datetime = schedule.get_next()
Or the previous:
datetime = schedule.get_prev()
The get methods work in the same way as default croniter, except that they will return datetime objects by default instead of floats. If you want the original functionality, pass float into method when calling:
datetime = schedule.get_current(float)
If you don’t have croniter module installed, you’ll get an ImportError when you first try using schedule function on your cron job object.
CronManager Example
import argparse
import os ,sys
import logging
from crontab import CronTab
"""
Task Scheduler
==========
This module manages periodic tasks using cron.
"""
class CronManager:
    def __init__(self):
        self.cron = CronTab(user=True)

    def add_minutely(self, name, user, command, environment=None):
        """
        Add a cron task that runs every 2 minutes
        """
        cron_job = self.cron.new(command=command, user=user)
        cron_job.minute.every(2)
        cron_job.enable()
        self.cron.write()
        if self.cron.render():
            print(self.cron.render())
            return True

    def add_hourly(self, name, user, command, environment=None):
        """
        Add an hourly cron task
        """
        cron_job = self.cron.new(command=command, user=user)
        cron_job.minute.on(0)
        cron_job.hour.during(0,23)
        cron_job.enable()
        self.cron.write()
        if self.cron.render():
            print(self.cron.render())
            return True

    def add_daily(self, name, user, command, environment=None):
        """
        Add a daily cron task
        """
        cron_job = self.cron.new(command=command, user=user)
        cron_job.minute.on(0)
        cron_job.hour.on(0)
        cron_job.enable()
        self.cron.write()
        if self.cron.render():
            print(self.cron.render())
            return True

    def add_weekly(self, name, user, command, environment=None):
        """
        Add a weekly cron task
        """
        cron_job = self.cron.new(command=command)
        cron_job.minute.on(0)
        cron_job.hour.on(0)
        cron_job.dow.on(1)
        cron_job.enable()
        self.cron.write()
        if self.cron.render():
            print(self.cron.render())
            return True

    def add_monthly(self, name, user, command, environment=None):
        """
        Add a monthly cron task
        """
        cron_job = self.cron.new(command=command)
        cron_job.minute.on(0)
        cron_job.hour.on(0)
        cron_job.day.on(1)
        cron_job.month.during(1,12)
        cron_job.enable()
        self.cron.write()
        if self.cron.render():
            print(self.cron.render())
            return True

    def add_quarterly(self, name, user, command, environment=None):
        """
        Add a quarterly cron task
        """
        cron_job = self.cron.new(command=command)
        cron_job.minute.on(0)
        cron_job.hour.on(0)
        cron_job.day.on(1)
        cron_job.month.on(3,6,9,12)
        cron_job.enable()
        self.cron.write()
        if self.cron.render():
            print(self.cron.render())
            return True

    def add_anually(self, name, user, command, environment=None):
        """
        Add a yearly cron task
        """
        cron_job = self.cron.new(command=command)
        cron_job.minute.on(0)
        cron_job.hour.on(0)
        # pin the day of the month; otherwise the job runs every day in December
        cron_job.day.on(1)
        cron_job.month.on(12)
        cron_job.enable()
        self.cron.write()
        if self.cron.render():
            print(self.cron.render())
            return True
Split urls into components in python
Aug. 27, 2020, 5:37 p.m.
The urllib.parse module provides functions for manipulating URLs and their component parts, to either break them down or build them up.
Parsing
The return value from the urlparse() function is a ParseResult object that acts like a tuple with six elements.
urllib_parse_urlparse.py
from urllib.parse import urlparse
url = 'http://netloc/path;param?query=arg#frag'
parsed = urlparse(url)
print(parsed)
The parts of the URL available through the tuple interface are the scheme, network location, path, path segment parameters (separated from the path by a semicolon), query, and fragment.
RUN:
python3 urllib_parse_urlparse.py
output:
ParseResult(scheme='http', netloc='netloc', path='/path',
params='param', query='query=arg', fragment='frag')
Although the return value acts like a tuple, it is really based on a namedtuple, a subclass of tuple that supports accessing the parts of the URL via named attributes as well as indexes. In addition to being easier to use for the programmer, the attribute API also offers access to several values not available in the tuple API.
urllib_parse_urlparseattrs.py
from urllib.parse import urlparse
url = 'http://user:pwd@netloc:80/path;param?query=arg#frag'
parsed = urlparse(url)
print('scheme :', parsed.scheme)
print('netloc :', parsed.netloc)
print('path :', parsed.path)
print('params :', parsed.params)
print('query :', parsed.query)
print('fragment:', parsed.fragment)
print('username:', parsed.username)
print('password:', parsed.password)
print('hostname:', parsed.hostname)
print('port :', parsed.port)
The username and password are available when present in the input URL, and set to None when not. The hostname is the same value as netloc, in all lower case and with the port value stripped. And the port is converted to an integer when present and None when not.
RUN:
python3 urllib_parse_urlparseattrs.py
output:
scheme : http
netloc : user:pwd@netloc:80
path : /path
params : param
query : query=arg
fragment: frag
username: user
password: pwd
hostname: netloc
port : 80
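As a quick check of the None behaviour described above, here is a minimal sketch using a URL without credentials or an explicit port. The example.com address is an assumption chosen only for illustration.

```python
from urllib.parse import urlparse

# Hypothetical URL with no user info and no explicit port.
parsed = urlparse('http://Example.COM/path')

print(parsed.netloc)    # netloc keeps the original case
print(parsed.hostname)  # hostname is lower-cased
print(parsed.username)  # None, no credentials in the URL
print(parsed.port)      # None, no explicit port
```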
The urlsplit() function is an alternative to urlparse(). It behaves a little differently, because it does not split the parameters from the URL. This is useful for URLs following RFC 2396, which supports parameters for each segment of the path.
urllib_parse_urlsplit.py
from urllib.parse import urlsplit
url = 'http://user:pwd@NetLoc:80/p1;para/p2;para?query=arg#frag'
parsed = urlsplit(url)
print(parsed)
print('scheme :', parsed.scheme)
print('netloc :', parsed.netloc)
print('path :', parsed.path)
print('query :', parsed.query)
print('fragment:', parsed.fragment)
print('username:', parsed.username)
print('password:', parsed.password)
print('hostname:', parsed.hostname)
print('port :', parsed.port)
Since the parameters are not split out, the tuple API will show five elements instead of six, and there is no params attribute.
RUN:
python3 urllib_parse_urlsplit.py
Output:
SplitResult(scheme='http', netloc='user:pwd@NetLoc:80',
path='/p1;para/p2;para', query='query=arg', fragment='frag')
scheme : http
netloc : user:pwd@NetLoc:80
path : /p1;para/p2;para
query : query=arg
fragment: frag
username: user
password: pwd
hostname: netloc
port : 80
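The difference between the two functions is easiest to see side by side. The following is a small sketch (the URL is a shortened variant of the one above, chosen for illustration) comparing how each treats per-segment parameters.

```python
from urllib.parse import urlparse, urlsplit

url = 'http://netloc/p1;para/p2;para'

parsed = urlparse(url)
split = urlsplit(url)

# urlparse() only separates the parameters of the last path segment.
print(len(parsed), parsed.path, parsed.params)
# urlsplit() leaves every parameter inside the path.
print(len(split), split.path)
```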
To simply strip the fragment identifier from a URL, such as when finding a base page name from a URL, use urldefrag().
urllib_parse_urldefrag.py
from urllib.parse import urldefrag
original = 'http://netloc/path;param?query=arg#frag'
print('original:', original)
d = urldefrag(original)
print('url :', d.url)
print('fragment:', d.fragment)
The return value is a DefragResult, based on namedtuple, containing the base URL and the fragment.
RUN:
python3 urllib_parse_urldefrag.py
Output:
original: http://netloc/path;param?query=arg#frag
url : http://netloc/path;param?query=arg
fragment: frag
Unparsing
There are several ways to assemble the parts of a split URL back together into a single string. The parsed URL object has a geturl() method.
urllib_parse_geturl.py
from urllib.parse import urlparse
original = 'http://netloc/path;param?query=arg#frag'
print('ORIG :', original)
parsed = urlparse(original)
print('PARSED:', parsed.geturl())
geturl() only works on the object returned by urlparse() or urlsplit().
RUN:
python3 urllib_parse_geturl.py
Output:
ORIG : http://netloc/path;param?query=arg#frag
PARSED: http://netloc/path;param?query=arg#frag
A regular tuple containing strings can be combined into a URL with urlunparse().
urllib_parse_urlunparse.py
from urllib.parse import urlparse, urlunparse
original = 'http://netloc/path;param?query=arg#frag'
print('ORIG :', original)
parsed = urlparse(original)
print('PARSED:', type(parsed), parsed)
t = parsed[:]
print('TUPLE :', type(t), t)
print('NEW :', urlunparse(t))
While the ParseResult returned by urlparse() can be used as a tuple, this example explicitly creates a new tuple to show that urlunparse() works with normal tuples, too.
RUN:
python3 urllib_parse_urlunparse.py
Output:
ORIG : http://netloc/path;param?query=arg#frag
PARSED: <class 'urllib.parse.ParseResult'>
ParseResult(scheme='http', netloc='netloc', path='/path',
params='param', query='query=arg', fragment='frag')
TUPLE : <class 'tuple'> ('http', 'netloc', '/path', 'param',
'query=arg', 'frag')
NEW : http://netloc/path;param?query=arg#frag
If the input URL included superfluous parts, those may be dropped from the reconstructed URL.
urllib_parse_urlunparseextra.py
from urllib.parse import urlparse, urlunparse
original = 'http://netloc/path;?#'
print('ORIG :', original)
parsed = urlparse(original)
print('PARSED:', type(parsed), parsed)
t = parsed[:]
print('TUPLE :', type(t), t)
print('NEW :', urlunparse(t))
In this case, parameters, query, and fragment are all empty in the original URL. The new URL does not look the same as the original, but is equivalent according to the standard.
RUN:
python3 urllib_parse_urlunparseextra.py
Output:
ORIG : http://netloc/path;?#
PARSED: <class 'urllib.parse.ParseResult'>
ParseResult(scheme='http', netloc='netloc', path='/path',
params='', query='', fragment='')
TUPLE : <class 'tuple'> ('http', 'netloc', '/path', '', '', '')
NEW : http://netloc/path
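Because ParseResult is a namedtuple, one way to change a single component before unparsing is the standard namedtuple method _replace(). This is a minimal sketch; the new query value is made up for illustration.

```python
from urllib.parse import urlparse

parsed = urlparse('http://netloc/path;param?query=arg#frag')

# _replace() returns a new ParseResult and leaves the original untouched.
updated = parsed._replace(query='page=2', fragment='')

print(parsed.geturl())
print(updated.geturl())
```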
Joining
In addition to parsing URLs, urllib.parse includes urljoin() for constructing absolute URLs from relative fragments.
urllib_parse_urljoin.py
from urllib.parse import urljoin
print(urljoin('http://www.example.com/path/file.html',
'anotherfile.html'))
print(urljoin('http://www.example.com/path/file.html',
'../anotherfile.html'))
In the example, the relative portion of the path ("../") is taken into account when the second URL is computed.
RUN:
python3 urllib_parse_urljoin.py
Output:
http://www.example.com/path/anotherfile.html
http://www.example.com/anotherfile.html
Non-relative paths are handled in the same way as by os.path.join().
urllib_parse_urljoin_with_path.py
from urllib.parse import urljoin
print(urljoin('http://www.example.com/path/',
'/subpath/file.html'))
print(urljoin('http://www.example.com/path/',
'subpath/file.html'))
If the path being joined to the URL starts with a slash (/), it resets the URL’s path to the top level. If it does not start with a slash, it is appended to the end of the path for the URL.
RUN:
python3 urllib_parse_urljoin_with_path.py
Output:
http://www.example.com/subpath/file.html
http://www.example.com/path/subpath/file.html
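Two further cases are worth knowing and can be sketched as follows: a base URL without a trailing slash, and a second argument that is already an absolute URL. The other.example.com host is a made-up name for illustration.

```python
from urllib.parse import urljoin

# Without a trailing slash, the last segment of the base is treated as a
# file name and is replaced by the relative path.
no_slash = urljoin('http://www.example.com/path', 'subpath/file.html')

# An absolute second argument replaces the base entirely.
absolute = urljoin('http://www.example.com/path/',
                   'http://other.example.com/file.html')

print(no_slash)
print(absolute)
```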
Encoding Query Arguments
Before arguments can be added to a URL, they need to be encoded.
urllib_parse_urlencode.py
from urllib.parse import urlencode
query_args = {
'q': 'query string',
'foo': 'bar',
}
encoded_args = urlencode(query_args)
print('Encoded:', encoded_args)
Encoding replaces special characters like spaces to ensure they are passed to the server using a format that complies with the standard.
RUN:
python3 urllib_parse_urlencode.py
Output:
Encoded: q=query+string&foo=bar
To pass a sequence of values using separate occurrences of the variable in the query string, set doseq to True when calling urlencode().
urllib_parse_urlencode_doseq.py
from urllib.parse import urlencode
query_args = {
'foo': ['foo1', 'foo2'],
}
print('Single :', urlencode(query_args))
print('Sequence:', urlencode(query_args, doseq=True))
The result is a query string with several values associated with the same name.
RUN:
python3 urllib_parse_urlencode_doseq.py
Output:
Single : foo=%5B%27foo1%27%2C+%27foo2%27%5D
Sequence: foo=foo1&foo=foo2
To decode the query string, use parse_qs() or parse_qsl().
urllib_parse_parse_qs.py
from urllib.parse import parse_qs, parse_qsl
encoded = 'foo=foo1&foo=foo2'
print('parse_qs :', parse_qs(encoded))
print('parse_qsl:', parse_qsl(encoded))
The return value from parse_qs() is a dictionary mapping names to lists of values, while parse_qsl() returns a list of tuples containing a name and a value.
RUN:
python3 urllib_parse_parse_qs.py
Output:
parse_qs : {'foo': ['foo1', 'foo2']}
parse_qsl: [('foo', 'foo1'), ('foo', 'foo2')]
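Putting encoding and decoding together, a round trip might look like the sketch below. The base URL and the argument names are assumptions made up for the example.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base = 'http://www.example.com/search'  # hypothetical endpoint
args = {'q': 'query string', 'tag': ['a', 'b']}

# Build the full URL; doseq=True expands the list into repeated names.
url = base + '?' + urlencode(args, doseq=True)
print(url)

# Split the URL again and decode only its query component.
decoded = parse_qs(urlsplit(url).query)
print(decoded)
```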
Special characters within the query arguments that might cause parse problems with the URL on the server side are “quoted” when passed to urlencode(). To quote them locally to make safe versions of the strings, use the quote() or quote_plus() functions directly.
urllib_parse_quote.py
from urllib.parse import quote, quote_plus, urlencode
url = 'http://localhost:8080/~hellmann/'
print('urlencode() :', urlencode({'url': url}))
print('quote() :', quote(url))
print('quote_plus():', quote_plus(url))
The quoting implementation in quote_plus() is more aggressive about the characters it replaces.
RUN:
python3 urllib_parse_quote.py
Output:
urlencode() : url=http%3A%2F%2Flocalhost%3A8080%2F~hellmann%2F
quote() : http%3A//localhost%3A8080/~hellmann/
quote_plus(): http%3A%2F%2Flocalhost%3A8080%2F~hellmann%2F
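The difference comes down to how spaces and slashes are treated. A short sketch with a made-up input string:

```python
from urllib.parse import quote, quote_plus

text = 'a b/c'

default_quote = quote(text)           # space becomes %20, '/' is kept
strict_quote = quote(text, safe='')   # with safe='', '/' is encoded too
plus_quote = quote_plus(text)         # space becomes '+', '/' is encoded

print(default_quote)
print(strict_quote)
print(plus_quote)
```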
To reverse the quote operations, use unquote() or unquote_plus(), as appropriate.
urllib_parse_unquote.py
from urllib.parse import unquote, unquote_plus
print(unquote('http%3A//localhost%3A8080/%7Ehellmann/'))
print(unquote_plus(
'http%3A%2F%2Flocalhost%3A8080%2F%7Ehellmann%2F'
))
The encoded value is converted back to a normal string URL.
RUN:
python3 urllib_parse_unquote.py
Output:
http://localhost:8080/~hellmann/
http://localhost:8080/~hellmann/
Convert python dictionary to a json string
July 16, 2020, 3:14 a.m.
test.py
import json
python_information = {'id': '1234567890', 'name': 'Naruto', 'job': 'Software Engineer', 'languages': [{'English': 'Professional'}, {'Japanese': 'Professional'}, {'Korean': 'Native'}]}
# Convert Python Dictionary into a JSON String:
p = json.dumps(python_information)
print(p)
Output:
$ python test.py
{"id": "1234567890", "name": "Naruto", "job": "Software Engineer", "languages": [{"English": "Professional"}, {"Japanese": "Professional"}, {"Korean": "Native"}]}
Note:
- If you try print(p["id"]), it fails with a TypeError, because p is a string and no longer a Python dictionary.
- The dictionary's single-quoted strings become double-quoted strings in the JSON output.
- JSON strings must be written with double quotes.
- A JSON string is not a data structure. Do not confuse it with a Python dictionary.
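The notes above can be checked with a small sketch. It uses a shortened version of the dictionary and also shows the indent and sort_keys options of json.dumps() for readable output.

```python
import json

data = {'name': 'Naruto', 'id': '1234567890'}

text = json.dumps(data)
print(type(text))            # the result is a str, not a dict

restored = json.loads(text)  # convert back before indexing by key
print(restored['id'])

# indent and sort_keys produce human-readable, stable output.
pretty = json.dumps(data, indent=2, sort_keys=True)
print(pretty)
```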
Convert json string to python dictionary
July 16, 2020, 3:09 a.m.
JSON stands for JavaScript Object Notation.
JSON is a serialization format.
A Python dictionary is an in-memory data structure with its own methods.
A Python dictionary key can be any hashable object, while a JSON object key can only be a string.
*JSON is text; dictionaries are a data structure in memory.
import json
json_information = """
{
"id": "1234567890",
"name": "Naruto",
"job": "Software Engineer",
"languages": [
{
"English": "Professional"
},
{
"Japanese": "Professional"
},
{
"Korean": "Native"
}
]
}
"""
# Convert a JSON string to Python dictionary:
j = json.loads(json_information)
print(j["id"])
print(j["name"])
print(j["job"])
print(j["languages"])
print(j)
Output:
1234567890
Naruto
Software Engineer
[{'English': 'Professional'}, {'Japanese': 'Professional'}, {'Korean': 'Native'}]
{'id': '1234567890', 'name': 'Naruto', 'job': 'Software Engineer', 'languages': [{'English': 'Professional'}, {'Japanese': 'Professional'}, {'Korean': 'Native'}]}
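One practical consequence of JSON keys being strings: a dictionary with non-string keys does not survive a round trip unchanged. A minimal sketch with made-up data:

```python
import json

original = {1: 'one', 2: 'two'}

text = json.dumps(original)   # integer keys are written as strings
restored = json.loads(text)   # and come back as strings

print(text)
print(restored)
print(restored == original)
```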
Python tools to write better code
April 8, 2020, 1:12 p.m.
The Python community maintains a set of tools that are helpful in every project. They give quick feedback on the health of your code and on how closely it follows standards and best practices.
These tools are:
1) pep8 (now published as pycodestyle): style checker
2) pyflakes: checks source code for errors
3) mccabe: complexity checker
4) flake8: code checker that combines pep8, pyflakes, mccabe, and third-party plugins to check the style and quality of Python code
5) Pylint: checks for coding standards, errors and duplicated code
6) Coverage: measures how much of the code the tests exercise
7) Black: the uncompromising Python code formatter
How to operations and usage of sets on python
Nov. 16, 2019, 8:30 a.m.
What is a set?
- A set is a built-in data type (data structure) in Python.
- It stores only unique elements; duplicates are discarded.
- It is unordered, so the order in which elements are printed can vary.
- We can iterate over sets.
How to create a set in python?
# case1: empty set with the built-in set() constructor ({} would create an empty dict)
s = set()
print(s)
# Output: set()
print(type(s))
# Output: <class 'set'>
# case2: set with initial data
s = {1, 'a', '@', 'batta', 2.22}
print(s)
# Output: {1, 2.22, 'a', 'batta', '@'} (element order may vary)
print(type(s))
# Output: <class 'set'>
How to add an element to a set in python ?
s = {1,2,3,4,5,6,'a'}
print(s)
# Output: {1, 2, 3, 4, 5, 6, 'a'} (element order may vary)
s.add(100)
print(s)
# Output: {1, 2, 3, 4, 5, 6, 100, 'a'} (element order may vary)
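How to remove an element from a set in python?
# remove() raises KeyError if the element is not in the set
s = {1,2,3,4,5,6,'a'}
r = s.remove(5)
print(s)
# Output: {1, 2, 3, 4, 6, 'a'} (element order may vary)
print(r)
# Output: None
# discard() does nothing if the element is not in the set
s = {1,2,3,4,5,6,'a'}
r = s.discard(5)
print(s)
# Output: {1, 2, 3, 4, 6, 'a'} (element order may vary)
print(r)
# Output: None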
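The two methods only differ when the element is missing. A minimal sketch with a small made-up set:

```python
s = {1, 2, 3}

# discard() silently ignores a missing element.
s.discard(99)

# remove() raises KeyError for a missing element.
try:
    s.remove(99)
except KeyError as exc:
    missing = exc.args[0]
    print('KeyError for', missing)

print(s)
```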
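How to remove and get an arbitrary element from a set in python?
# Sets are unordered, so pop() removes and returns an arbitrary element, not the "last" one.
s = {1,2,3,4,5,6,'a'}
r = s.pop()
print(s)
# Output (one possibility): {2, 3, 4, 5, 6, 'a'}
print(r)
# Output (one possibility): 1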
How to clear or empty the set in python?
s = {'a', 1,2,3,4,5,6}
s.clear()
print(s)
# Output: set()
How to copy set1 to set2 ?
s1 = {1,2,3,4, 'a', 'hello', 1.255}
# id() is a built-in function that returns an object's identity (its memory address in CPython).
s2 = s1.copy()
print(id(s1))
# Output: 139864877446360
print(id(s2))
# Output: 139864877446128
# if we do not use 'copy'
s3 = {1,2,3,4}
s4 = s3
print(id(s3))
# Output: 139864878456528
print(id(s4))
# Output: 139864878456528
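Identical ids mean both names refer to the same set, so a change through one name is visible through the other. A copy is independent. A short sketch with made-up variable names:

```python
s1 = {1, 2, 3}
alias = s1          # same object, no copy is made
clone = s1.copy()   # new, independent set

s1.add(4)

print(alias)  # the alias sees the new element
print(clone)  # the copy does not
```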
How to find difference of two sets in python ?
s1 = {1,2,3,4,5,6,7,8}
s2 = {4,5,6,9, 10, 11, 12, 13, 14}
difference = s1.difference(s2)
print(difference)
# Output: {1, 2, 3, 7, 8}
# shortcut method
difference = s1 - s2
print(difference)
# Output: {1, 2, 3, 7, 8}
# It does not modify the initial data in s1, s2
print(s1)
# Output: {1, 2, 3, 4, 5, 6, 7, 8}
print(s2)
# Output: {4, 5, 6, 9, 10, 11, 12, 13, 14}
How to update the difference of set1, set2 into set1 ?
s1 = {1,2,3,4,5,6,7,8}
s2 = {4,5,6,9, 10, 11, 12, 13, 14}
difference = s1.difference_update(s2)
print(difference)
# Output: None
print(s1)
# Output: {1, 2, 3, 7, 8}
print(s2)
# Output: {4, 5, 6, 9, 10, 11, 12, 13, 14}
How to find the common elements in set1, set2 in python?
s1 = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11}
s2 = {4, 5, 6, 9, 10, 11, 12, 13, 14}
intersection = s1.intersection(s2)
print(intersection)
# Output: {4, 5, 6, 9, 10, 11}
How to check whether set1 and set2 contains common elements or not ?
# find if sets disjoint or not
set1 = set([1, 2.55, 3, "a"])
set2 = set(["hello", "world", "batta"])
is_disjoint = set1.isdisjoint(set2)
print(is_disjoint)
# Output: True
# Because there are no common elements
set1 = set([1, 2.55, 3, "a"])
set2 = set(["hello", 1, "a"])
is_disjoint = set1.isdisjoint(set2)
print(is_disjoint)
# Output: False
# common elements are 1, "a"
How to check if given set s1 is subset of other set s2 or not ?
# case1
set1 = {0, 1, 2, 3, "abcd", 5, 6, 7, 8, 9} # superset
set2 = {1, 3, 5} # subset
result = set2.issubset(set1)
print(result)
# Output: True
# case2
set1 = {0, 1, 2, 3, "abcd", 5, 6, 7, 8, 9} # superset
set2 = {1, 3, 5} # subset
result = set1.issubset(set2)
print(result)
# Output: False
# case3
set1 = {0, 1, 2, 3, "abcd", 5, 6, 7, 8, 9}
set2 = {1, 3, 5, "extra element"}
result = set2.issubset(set1)
print(result)
# Output: False
How to check if given set s1 is superset of other set s2 or not ?
# case1
set1 = {0, 1, 2, 3, "abcd", 5, 6, 7, 8, 9} # superset
set2 = {1, 3, 5} # subset
result = set1.issuperset(set2)
print(result)
# Output: True
# case2
set1 = {0, 1, 2, 3, "abcd", 5, 6, 7, 8, 9} # superset
set2 = {1, 3, 5} # subset
result = set2.issuperset(set1)
print(result)
# Output: False
# case3
set1 = {0, 1, 2, 3, "abcd", 5, 6, 7, 8, 9}
set2 = {1, 3, 5, "extra element"}
result = set1.issuperset(set2)
print(result)
# Output: False
How to combine(union) set1 and set2 ?
set1 = {0, 1, 2, 3, 5, 6, 7, 8, 9, 'abcd'}
set2 = {1, 3, 5, 'extra element'}
result = set1.union(set2) # it will not modify the original sets & returns union of two sets.
print(result)
# Output: {0, 1, 2, 3, 5, 6, 7, 8, 9, 'abcd', 'extra element'}
# check if original sets changed or not
print(set1)
# Output: {0, 1, 2, 3, 5, 6, 7, 8, 9, 'abcd'}
print(set2)
# Output: {1, 3, 5, 'extra element'}
How to update set1 with other set set2 ?
set1 = {0, 1, 2, 3, 5, 6, 7, 8, 9, 'abcd'}
set2 = {1, 3, 5, 'extra element'}
result = set1.update(set2) # it will update the operating set & returns None.
print(result)
# Output: None
print(set1)
# Output: {'abcd', 1, 2, 3, 5, 6, 7, 8, 9, 'extra element', 0}
print(set2)
# Output: {1, 3, 5, 'extra element'}
How to find symmetric difference of two sets ?
The symmetric difference (also known as the disjunctive union) of two sets is the set of elements which are in either of the sets but not in their intersection.
set1 = {0, 1, 2, 3, 5, 6, 7, 8, 9, 'abcd'}
set2 = {1, 3, 5, 'extra element'}
result = set1.symmetric_difference(set2) # It will not change the original sets.
# it removes common elements in both sets and returns the union of remaining elements.
print(result)
# Output: {0, 2, 6, 7, 8, 9, 'abcd', 'extra element'}
# print original sets
print(set1)
# Output: {0, 1, 2, 3, 5, 6, 7, 8, 9, 'abcd'}
print(set2)
# Output: {1, 3, 5, 'extra element'}
How to find symmetric difference of two sets and update the result in same set?
set1 = {0, 1, 2, 3, 5, 6, 7, 8, 9, 'abcd'}
set2 = {1, 3, 5, 'extra element'}
result = set1.symmetric_difference_update(set2)
# it removes common elements in both sets and updates set1 with union of remaining elements.
print(result)
# Output: None
# print original sets
print(set1)
# Output: {0, 2, 6, 7, 8, 9, 'abcd', 'extra element'}
print(set2)
# Output: {1, 3, 5, 'extra element'}
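Most of the set methods shown in this article also have operator shortcuts, like the "-" shortcut used for difference earlier. A compact sketch with small made-up sets:

```python
set1 = {0, 1, 2, 3, 5}
set2 = {1, 3, 5, 7}

union = set1 | set2         # same as set1.union(set2)
common = set1 & set2        # same as set1.intersection(set2)
only_in_one = set1 ^ set2   # same as set1.symmetric_difference(set2)
is_subset = {1, 3} <= set1  # same as {1, 3}.issubset(set1)

print(union, common, only_in_one, is_subset)
```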
How to use "map" keyword or function in python?
Nov. 15, 2019, 10:45 p.m.
- "map" is a built-in function in Python.
- Its first argument is a function or any other callable object.
- All other arguments must be iterables (lists, tuples, strings and so on); otherwise it raises a TypeError.
- It can reduce the number of lines of code.
- It is a functional programming technique.
- In Python 3, map returns a lazy iterator over the results of applying the function to the items of the argument iterable(s); wrap it in list() to get a list. (Python 2 returned a list directly.)
- If more than one iterable is given, the function is called with the corresponding item of each iterable, and iteration stops at the shortest one. (Python 2 instead padded the shorter sequences with None.)
Convert list of numbers to list of strings without using map
l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
output = []
for i in l:
    output.append(str(i))
print(output)
# Output: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
Convert list of numbers to list of strings using map
# I love to use it
l = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
output = list(map(str, l))  # map() returns an iterator in Python 3, so wrap it in list()
print(output)
# Output: ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
Return list of elements by adding numbers of two lists based on their indexes - Traditional way
l1 = [1, 2, 3, 4, 5, 6]
l2 = [10, 20, 30, 40, 50, 60]
output = []
for i, j in zip(l1, l2):
    output.append(i+j)
print(output)
# Output: [11, 22, 33, 44, 55, 66]
Return list of elements by adding numbers of two lists based on their indexes - Functional Programming
l1 = [1, 2, 3, 4, 5, 6]
l2 = [10, 20, 30, 40, 50, 60]
def sum_elements(a, b):
    return a + b
output = list(map(sum_elements, l1, l2))
print(output)
# Output: [11, 22, 33, 44, 55, 66]
Let's test the map with different types of inputs
how to use multiple arguments with "map" python?
def function(*args):
    return args
output = list(map(function, [1,2,3], ("a", "b", "c"), (1.2, 2.3, 3.4)))
print(output)
# Output: [(1, 'a', 1.2), (2, 'b', 2.3), (3, 'c', 3.4)]
how to use multiple arguments of varying length with "map" with python?
def function(*args):
    return args
output = list(map(function, [1,2,3], ("a", "b", "c", "d", "e", "f"), (1.2, 2.3, 3.4)))
print(output)
# Output: [(1, 'a', 1.2), (2, 'b', 2.3), (3, 'c', 3.4)]
# Python 3 stops at the shortest input; Python 2 padded the missing values with None.
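In Python 3, map() stops at the shortest input, so padding the shorter inputs with None needs a different tool. One option, sketched here under the assumption that None is an acceptable filler, is itertools.zip_longest. The function helper mirrors the one used in the examples of this article.

```python
from itertools import zip_longest


def function(*args):
    return args


# zip_longest() keeps going until the longest input is exhausted and
# fills the gaps with None (the default fillvalue).
padded = [
    function(*items)
    for items in zip_longest([1, 2, 3], ("a", "b", "c", "d"), (1.2, 2.3, 3.4))
]
print(padded)
```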
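use "None" instead of function with "map" in python
# Python 2 only: map(None, ...) paired up the items like zip().
# In Python 3 this fails with TypeError: 'NoneType' object is not callable; use zip() instead.
output = list(zip([1,2,3], ("a", "b", "c"), (1.2, 2.3, 3.4)))
print(output)
# Output: [(1, 'a', 1.2), (2, 'b', 2.3), (3, 'c', 3.4)]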
use other data types like "list" instead of function with "map" in python
output = list(map([], [1,2,3], ("a", "b", "c"), (1.2, 2.3, 3.4)))
# Output: TypeError: 'list' object is not callable
# (In Python 3 the error appears when the map object is consumed, here by list().)
How to use "reduce" builtin function in python?
Nov. 15, 2019, 10:39 p.m.
- In Python 3, reduce lives in the "functools" module (from functools import reduce); in Python 2 it was a built-in function in module "__builtin__".
- It takes a function as the first argument and an iterable of items as the second argument.
- It applies the two-argument function cumulatively to the items, from left to right, and returns a single value.
- If initial is present, it is placed before the items of the sequence in the calculation, and serves as a default when the sequence is empty.
- An empty sequence with no initial value raises a TypeError.
Let's see examples
Let's sum all the elements in a list
traditional way
l = [1, 2, 3, 5, 6]
sum_of_numbers = 0
for num in l:
    sum_of_numbers += num
print(sum_of_numbers)
# Output: 17
Let's do it using "reduce"
from functools import reduce
l = [1, 2, 3, 5, 6]
def sum_numbers(a, b):
    return a + b
sum_of_numbers = reduce(sum_numbers, l)
print(sum_of_numbers)
# Output: 17
Let's find factorial of number 5
traditional way
num = 5
result = 1
for i in range(1, num+1):
    result = result * i
print("Factorial of 5 = %s" % (result))
# Output: Factorial of 5 = 120
Let's do it using "reduce"
from functools import reduce
num = 5
result = reduce(lambda x, y : x*y, range(1, num+1))
print("Factorial of 5 = %s" % (result))
# Output: Factorial of 5 = 120
Let's find out max number in a list of numbers using "reduce" and "max"
from functools import reduce
l = [3, 1, 5, 10, 7, 6]
max_num = reduce(max, l)
print("max num = %s" % (max_num))
# Output: max num = 10
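The role of the optional initial value can be sketched as follows. The example uses operator.add instead of a hand-written helper, which is a substitution made for brevity.

```python
from functools import reduce
from operator import add

# With an initial value, the calculation starts from it.
total = reduce(add, [1, 2, 3, 5, 6], 0)

# For an empty sequence, the initial value is returned as-is.
empty_total = reduce(add, [], 0)

# Without an initial value, an empty sequence raises TypeError.
try:
    reduce(add, [])
    raised = False
except TypeError:
    raised = True

print(total, empty_total, raised)
```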
How to use "filter" builtin function in python?
Nov. 15, 2019, 10:32 p.m.
- "filter" is a Python built-in function (module "builtins" in Python 3, "__builtin__" in Python 2).
- It takes two arguments: a function first, then an iterable of objects/elements.
- It passes each object/element to the given function, one after the other.
- It keeps the elements for which the function returns a true value.
- In Python 3 it returns a lazy iterator; wrap it in list(), tuple() or "".join() to get a concrete result. (Python 2 returned a list, or the same type for a tuple or string input.)
- If function is None, it returns the items that are true.
Let's see an example for "filter"
Q. Find out the all prime numbers below hundred ?
1. Traditional way
num = 100
primes = []
for i in range(2, num):
    for j in range(2, i):
        if i % j == 0:
            break
    else:
        # the inner loop finished without break, so i has no divisor
        primes.append(i)
print(primes)
# Output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
2. Using function "filter"
def is_prime(num):
    if num < 2:
        return False  # 0 and 1 are not prime
    for j in range(2, num):
        if num % j == 0:
            return False
    return True
primes = list(filter(is_prime, range(1, 100)))
print(primes)
# Output: [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97]
print(type(primes))
# Output: <class 'list'>
Let's test filter by passing "None" as first argument
a = (0, 1, 2, 3)
l = tuple(filter(None, a))
print(l)
# Output: (1, 2, 3)
print(type(l))
# Output: <class 'tuple'>
With None, filter converts each element to a boolean and keeps the element only if the result is True.
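To make the truthiness rule concrete, here is a small sketch with made-up values. It also shows the equivalent list comprehension, which many Python programmers find easier to read.

```python
values = [0, 1, '', 'a', None, [], [0]]

# filter(None, ...) keeps only the truthy items.
kept = list(filter(None, values))

# The same result written as a list comprehension.
same = [v for v in values if v]

print(kept)
print(kept == same)
```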
Let's apply "filter" on strings.
Remove vowels from string in python.
def remove_vowels(char):
    # returns True for characters to keep (everything except vowels)
    return char.lower() not in ['a', 'e', 'i', 'o', 'u']
s = "".join(filter(remove_vowels, "this is anjaneyulu batta"))
print(s)
# Output: ths s njnyl btt
Usage of "datetime" from datetime module with use cases
Nov. 15, 2019, 10:12 p.m.
- datetime can be found in the module datetime.
- datetime is a python's representation of date and time in a single object.
how to create datetime object in python ?
from datetime import datetime
time_date = datetime(
year=2017, month=3, day=2, hour=14, minute=0, second=0, microsecond=0, tzinfo=None
)
print(type(time_date))
# Output: <class 'datetime.datetime'>
print(time_date)
# Output: 2017-03-02 14:00:00
How to get date from datetime object ?
from datetime import datetime
time_date = datetime(
year=2017, month=3, day=2, hour=14, minute=0, second=0, microsecond=0, tzinfo=None
)
date = time_date.date()
print(type(date))
# Output: <class 'datetime.date'>
print(date)
# Output: 2017-03-02
How to get current datetime object in python ?
from datetime import datetime
current_datetime = datetime.now()
print(current_datetime)
# Output: 2017-03-07 21:31:59.720195
How to get year, month, day, hour, minute, second in python?
from datetime import datetime
current_datetime = datetime.now()
year = current_datetime.year
month = current_datetime.month
day = current_datetime.day
hour = current_datetime.hour
minute = current_datetime.minute
second = current_datetime.second
microsecond = current_datetime.microsecond
print("year= %s, month=%s, day=%s, hour=%s, minute=%s, second=%s, microsecond=%s" % (year, month, day, hour, minute, second, microsecond))
# output: year= 2017, month=3, day=7, hour=21, minute=41, second=50, microsecond=781903
How to convert a datetime object to string with a specific format python(datetime.strftime)?
"2017-02-04 12:02:33"
from datetime import datetime
date = datetime(2017, 2, 4, 12, 2, 33)
s = datetime.strftime(date, "%Y-%m-%d %H:%M:%S")
print(s)
# Output: 2017-02-04 12:02:33
"2017/02/04 12:02:33"
from datetime import datetime
date = datetime(2017, 2, 4, 12, 2, 33)
s = datetime.strftime(date, "%Y/%m/%d %H:%M:%S")
print(s)
#Output: 2017/02/04 12:02:33
"04/02/2017 12:02:33"
from datetime import datetime
date = datetime(2017, 2, 4, 12, 2, 33)
s = datetime.strftime(date, "%d/%m/%Y %H:%M:%S")
print(s)
# Output: 04/02/2017 12:02:33
"04 February 2017 12:02:33 PM"
from datetime import datetime
date = datetime(2017, 2, 4, 12, 2, 33)
s = datetime.strftime(date, "%d %B %Y %H:%M:%S %p")
print(s)
# Output: 04 February 2017 12:02:33 PM
"Saturday February 2017 12:02:33 PM"
from datetime import datetime
date = datetime(2017, 2, 4, 12, 2, 33)
s = datetime.strftime(date, "%A %B %Y %H:%M:%S %p")
print(s)
# Output: Saturday February 2017 12:02:33 PM
"Sat February 2017 12:02:33 pm"
from datetime import datetime
date = datetime(2017, 2, 4, 12, 2, 33)
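# %P (lower-case am/pm) is a platform-specific extension (for example glibc on Linux), not a documented Python directive
s = datetime.strftime(date, "%a %B %Y %H:%M:%S %P")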
print(s)
# Output: Sat February 2017 12:02:33 pm
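The examples above call strftime through the class. The same result can be obtained with the instance method or with an f-string format spec, as this short sketch shows (the variable names are made up for illustration).

```python
from datetime import datetime

moment = datetime(2017, 2, 4, 12, 2, 33)

via_class = datetime.strftime(moment, "%Y-%m-%d %H:%M:%S")
via_method = moment.strftime("%Y-%m-%d %H:%M:%S")
via_fstring = f"{moment:%Y-%m-%d %H:%M:%S}"

# isoformat() gives the ISO 8601 form without a format string.
iso = moment.isoformat()

print(via_method)
print(iso)
```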
How to convert a string to datetime object python(datetime.strptime)?
"2017-02-04 12:02:33"
from datetime import datetime
s = "2017-02-04 12:02:33"
d = datetime.strptime(s, "%Y-%m-%d %H:%M:%S")
print(d)
# Output: 2017-02-04 12:02:33
"2017/02/04 12:02:33"
from datetime import datetime
s = "2017/02/04 12:02:33"
d = datetime.strptime(s, "%Y/%m/%d %H:%M:%S")
print(d)
# Output: 2017-02-04 12:02:33
"04/02/2017 12:02:33"
from datetime import datetime
s = "04/02/2017 12:02:33"
d = datetime.strptime(s, "%d/%m/%Y %H:%M:%S")
print(d)
# Output: 2017-02-04 12:02:33
"04 February 2017 12:02:33 PM"
from datetime import datetime
s = "04 February 2017 12:02:33 PM"
d = datetime.strptime(s, "%d %B %Y %H:%M:%S %p")
print(d)
# Output: 2017-02-04 12:02:33
"Saturday 04 February 2017 12:02:33 PM"
from datetime import datetime
s = "Saturday 04 February 2017 12:02:33 PM"
d = datetime.strptime(s, "%A %d %B %Y %H:%M:%S %p")
print(d)
# Output: 2017-02-04 12:02:33
"Sat 04 February 2017 12:02:33 pm"
from datetime import datetime
s = "Sat 04 February 2017 12:02:33 pm"
d = datetime.strptime(s, "%a %d %B %Y %H:%M:%S %p")
print(d)
# Output: 2017-02-04 12:02:33
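When the string does not match the format, strptime raises ValueError. A minimal sketch of handling that case; parse_or_none is a hypothetical helper name, not part of the datetime module.

```python
from datetime import datetime


def parse_or_none(text, fmt):
    """Hypothetical helper: return a datetime, or None if text does not match fmt."""
    try:
        return datetime.strptime(text, fmt)
    except ValueError:
        return None


good = parse_or_none("04/02/2017 12:02:33", "%d/%m/%Y %H:%M:%S")
bad = parse_or_none("04/02/2017 12:02:33", "%Y-%m-%d %H:%M:%S")

print(good)
print(bad)
```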
Usage of "date" from datetime module with use cases
Nov. 13, 2019, 11:48 p.m.
- Date is a reference to a particular day represented within a calendar system.
- We use "date" class to represent a day in calendar in python
- We import date from "datetime" module
How to represent "date" in python ?
from datetime import date
d = date(year=2017, month=4, day=3)
print(d)
# Output: 2017-04-03
How to get current date in python ?
from datetime import date
d = date.today()
print(d)
# Output: we get today's date
How to get year, month, day from a date object ?
from datetime import date
d = date.today()
year = d.year
month = d.month
day = d.day
print("year = %s, month = %s, day = %s" % (year, month, day))
# Output: year = 2017, month = 3, day = 12
How to replace year, month, day in date object and get new date object ?
from datetime import date
d = date(year=2017, month=4, day=3)
# replace year with 2015
old_date = d.replace(year=2015)
print(old_date)
# Output: 2015-04-03
print(d)
# Output: 2017-04-03
# in the same way we can replace month, day
How to convert/parse string to date object python ?
'2017-04-03' >>> "%Y-%m-%d"
from datetime import datetime
s = '2017-04-03'
date_object = datetime.strptime(s, "%Y-%m-%d").date()
print(date_object)
# Output: 2017-04-03
'Monday 03 April 2017' >>> "%A %d %B %Y"
from datetime import datetime
s = 'Monday 03 April 2017'
date_object = datetime.strptime(s, "%A %d %B %Y").date()
print(date_object)
# Output: 2017-04-03
'Mon 03 Apr 2017' >>> "%a %d %b %Y"
from datetime import datetime
s = 'Mon 03 Apr 2017'
date_object = datetime.strptime(s, "%a %d %b %Y").date()
print(date_object)
# Output: 2017-04-03
How to format a date object as a string in python?
from datetime import date
d = date(2017, 4, 3)
date_string = d.strftime("%Y-%m-%d")
# output: '2017-04-03'
date_string = d.strftime("%A %d %B %Y")
# output: 'Monday 03 April 2017'
date_string = d.strftime("%a %d %b %Y")
# output: 'Mon 03 Apr 2017'
How to convert date object to "datetime" object ?
from datetime import datetime, date
d = date(2017, 4, 3)
datetime_object = datetime(d.year, d.month, d.day)
print(datetime_object)
# output: 2017-04-03 00:00:00
print(type(datetime_object))
# output: <class 'datetime.datetime'>
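An alternative to passing year, month and day by hand is datetime.combine(), which joins a date with a time. A brief sketch; the 14:30 time is an arbitrary value chosen for illustration.

```python
from datetime import date, datetime, time

d = date(2017, 4, 3)

# time.min is midnight, so this matches datetime(d.year, d.month, d.day).
midnight = datetime.combine(d, time.min)

# Any other time of day can be attached the same way.
afternoon = datetime.combine(d, time(14, 30))

print(midnight)
print(afternoon)
```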
How to compare two date objects ?
from datetime import date
date1 = date(2015, 12, 8)
date2 = date(2017, 11, 9)
# check if date1 > date2
is_date1_greater = date1 > date2
print(is_date1_greater)
# output: False
is_date1_greater = date2 > date1
print(is_date1_greater)
# output: True
How to get number of days between two dates ?
from datetime import date
date1 = date(2015, 12, 8)
date2 = date(2017, 11, 9)
time_delta = date1 - date2
print("days = %s" % (time_delta.days))
# output: days = -702
time_delta = date2 - date1
print("days = %s" % (time_delta.days))
# output: days = 702
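The subtraction returns a timedelta, and the same type works in the other direction: adding a timedelta to a date produces a new date. A short sketch reusing date1 from the example above:

```python
from datetime import date, timedelta

date1 = date(2015, 12, 8)

# Adding the 702-day difference gets back to the second date.
later = date1 + timedelta(days=702)

# timedelta also accepts weeks.
week_before = date1 - timedelta(weeks=1)

print(later)
print(week_before)
```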
How to get week day from date object ?
Return the day of the week represented by the date. Monday == 0, Tuesday == 1, ... , Sunday == 6
from datetime import date
d = date(2015, 12, 8)
print(d.strftime("%A"))
# Output: Tuesday
day_of_week = d.weekday()
print(day_of_week)
# Output: 1
How to get ISO week day from date object ?
Return the day of the week represented by the date. Monday == 1, Tuesday == 2, ... , Sunday == 7
from datetime import date
d = date(2015, 12, 8)
print(d.strftime("%A"))
# Output: Tuesday
day_of_week = d.isoweekday()
print(day_of_week)
# Output: 2