Seven Things Every Python Developer Should Know

By

IIH Global

Hey, people, I am back! This time I have something new and much more interesting to share. As an Electronics and Communication Engineer, my interests should lie in Signals and Systems, Electronic Devices and Circuits, and so on, but I am much more inclined towards Python. Doesn’t the name “Python” itself sound interesting? You could say I was attracted to Python, so I did a little research to learn more about it.

With the advancement of technology, you will see a lot of technologies in the market for everything from websites to mobile development, such as PHP, Laravel, .NET, Java, Angular, Node.js, and many more. So, what is Python? The answer is simple: it is a programming language that is getting more popular day by day. It was created by Guido van Rossum and first released in 1991.

The very next question that struck me was: what is it used for? It is mainly used for server-side web development, software development, mathematics, and system scripting. Python can do all of the following:

Handle big data and perform complex mathematics.
Run on a server to create web applications.
Support rapid prototyping as well as production-ready software development.
Connect to database systems, and read and modify files.
Work alongside other software to create workflows.

While reading all this, I wondered why one should choose Python over other languages. One reason is that it works on different platforms (Windows, Mac, Linux, Raspberry Pi, etc.). It has a simple syntax similar to the English language, with some influence from mathematics, and that syntax lets developers write programs in fewer lines than some other programming languages. In addition, prototyping can be very quick because Python runs on an interpreter system, meaning code can be executed as soon as it is written.

Unlike many other programming languages, which often use semicolons or parentheses, Python uses new lines to complete a command. It also relies on indentation, using whitespace, to define scope, such as the scope of loops, functions, and classes. Most other programming languages use curly brackets for this purpose.
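To make the indentation rule concrete, here is a minimal sketch. The function name classify is made up for illustration; the point is only that the nesting of the for, if, and else blocks is expressed purely through whitespace.

```python
def classify(numbers):
    """Label each number as even or odd; indentation alone marks each block."""
    labels = []
    for n in numbers:          # the loop body is everything indented under 'for'
        if n % 2 == 0:         # the 'if' body is indented one level further
            labels.append("even")
        else:
            labels.append("odd")
    return labels              # back at function level, so this runs after the loop


result = classify([1, 2, 3])
print(result)  # ['odd', 'even', 'odd']
```

Moving the return line one level to the right would place it inside the loop and end the function after the first number, which is why consistent indentation matters so much in Python.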

Here are some more observations on Python programming.

Python is not the most used language on the web, but it is constantly growing, especially in startup environments where time and budget are usually limited. The language is modular, which means functionality is split across modules. The developer first creates the modules and then, based on “if-then” conditions that depend on the user’s action, the program triggers a particular block and returns the result. A Python developer usually deals with back-end components, connects apps to third-party web services, and supports frontend developers in web applications. Of course, you might create applications in other languages, but Python is often the language chosen, and there are reasons for that!

Quite obviously, Python is the main language you are going to use at work to finish the project. Fortunately, if you are a developer focused on other languages, the switch should come with ease. Python is a general-purpose programming language with constantly increasing demand. Thanks to its relatively gentle learning curve, it is beginner-friendly and definitely experienced-developer-friendly as well!

There is no need to know every module, but beyond basic syntax and semantics you should at least know the differences between Python 2 and 3. Good Python developers can smoothly adjust to those, and it is not a big deal because the distinction is rarely needed. It is also advisable to know Python’s data structures. While you do not have to learn by heart how to implement a B-tree, knowing what lies under the hood of a set or a list will come in handy in both small and big projects.

Python 3 is the most recent major version of Python. Python 2 still turns up in older codebases, but it reached end of life in January 2020 and no longer receives updates, including security fixes.

Python Frameworks

Knowing Python frameworks is a must, but that does not mean a Python developer has to know them all. Depending on the project you may be asked to know one or another, but the most widely used are Django, Flask, and CherryPy. If you already know Python, you have probably had a chance to work with at least one of the most popular frameworks! Developers usually appreciate the basic, well-defined structure that frameworks offer while they figure out the core logic of the application.

Connecting applications to a database through an ORM (Object Relational Mapper) such as SQLAlchemy or the Django ORM is often easier and faster than writing raw SQL, which means teams are more likely to prefer it. It is good to have in your skill set!

Basic Understanding of Front-End Technologies (HTML5, JavaScript, CSS3)

Very often a Python developer has to cooperate with the frontend team to match the server side with the client side. It is therefore important to understand how the frontend works, what is possible and what is not, and how the application is going to look. Of course, in a proper agile software house there is also a UX team, a project or product manager, and a Scrum master to coordinate the workflow. This doesn’t mean that a Python developer must have frontend knowledge, but in some projects this kind of knowledge and experience is more than welcome.

Python Libraries

Libraries make a developer’s life easier, the team’s workflow more efficient, and task execution much faster. Depending on the nature of the project, it pays to know the libraries that are going to help you in everyday work. Python, as a community-driven programming language, has an answer to almost any possible request. Look at any list of the 20 most commonly used Python libraries and you will know exactly what we are talking about!

Version Control

Keeping track of every change made to a file, so that the code can be traced later, is a must-know for every developer! You will see it as a requirement in most job offers. Thankfully, it is not difficult to get familiar with, and if you have been coding for a while you have probably already set up your GitHub account, so terms like “push, fork, pull, commit” are not random words to you.

AI and Machine Learning

It will be a huge plus if you know what this is about! AI and Machine Learning (as well as deep learning) are constantly growing fields, and Python is a very good fit for them. If you are into data science, then digging into Machine Learning would definitely be a great idea.

Communication Skills

Let’s not forget that a developer’s work is not only typing lines of code! In the best software development firms, teams are made up of amazing programmers who work together to achieve the final goal, whether that means finishing a project, creating a new app, or helping a startup skyrocket. Working in a team means that a developer has to communicate well, not only to get things done but also to keep the documentation clear so that others can easily read it and follow the line of thinking to fully understand the idea.

Now, after diving deeper into Python, I found there are certain things developers should know about this language.

Many Macs and PCs already have Python installed. To check whether it is installed on your Windows PC, search for Python in the Start bar or run the following on the Command Line (cmd.exe):

C:\Users\Your Name>python --version

To check whether Python is installed on your Mac or Linux machine, open the Terminal on a Mac or the command line on Linux and type:

python --version

Python is an interpreted programming language. This means that, as a developer, you write Python (.py) files in a text editor and then hand those files to the Python interpreter to be executed. A Python file is run from the command line like this:

C:\Users\Your Name>python helloworld.py

Here, the name of your python file is “helloworld.py”.

Seems exciting! Let’s create this file, helloworld.py, in any text editor with the following content:

print("Hello, World!")

It is as easy as that. Save your file, open your command line, navigate to the directory where you saved it, and run:

C:\Users\Your Name>python helloworld.py

The output should read:

Hello, World!

Congratulations! You have written and executed your first Python program.

Isn’t that interesting? Having seen the practical side of Python, let’s explore some more facts about it that even many developers don’t know much about, beginning with version numbers.

Python Version Numbers

Although this is not technically a programming feature, it is still important to know how Python versions work so that we are all on the same page. Python versions are numbered A.B.C, where the three numbers represent, in decreasing order of importance, changes in the language. For instance, going from 2.7.3 to 2.7.4 means that the Python build made some minor bug fixes, while going from 2.x to 3.x represents a major change. The ‘x’ here is intentional: if a Python feature applies to version 2.7.C for any valid value of ‘C’, we put in an ‘x’ and refer to Python 2.7.x. We can also drop the ‘x’ entirely and just say 2.7.

As of July 2013, Python had two stable versions in common use: 2.7 and 3.3. The third number is less important, but the current releases at that time were 2.7.5 and 3.3.2, both released on May 15, 2013. The short answer was that, while both 2.7 and 3.3 were perfectly fine to use, 3.3 was the future of the language, and someone just starting Python should probably choose Python 3 over 2.7. Of course, if one is in the middle of an extensive research project that makes heavy use of 2.7, it might not make sense to upgrade right away. That was quite similar to my own situation, since I was using a good number of my own Python 2.7 scripts to help me analyze algorithmic combinatorics on words, and I planned to transition completely to Python 3.3 that August.

One of the most important things, though, is that Python 3 is intentionally backward incompatible. Backward compatibility is often a desired feature of programming languages and software that routinely undergo revisions, since it means that input from older versions (e.g. older Python programs) can still run under the latest builds. In this case, Python 2 code is not guaranteed to run successfully under Python 3, so some conversion may be necessary. Backward incompatibility was necessary to allow Python 3 to be more concise and clear and to add new features.

In the meantime, what’s the difference between Python 2 and 3, anyway? That’s beyond the scope of this post, but the official Python documentation covers it in detail. I suppose the main improvement is better Unicode support, but there have also been some minor fixes here and there that improve some of the annoying features of 2.7. Still, while there are enough changes to warrant a 2.x to 3.x change, Guido van Rossum says in his overview that:

“Nonetheless, after digesting the changes, you’ll find that Python really hasn’t changed all that much – by and large, we’re mostly fixing well-known annoyances and warts, and removing a lot of old cruft.”

A note: Guido van Rossum is Python’s creator, and he led the language’s development for decades before stepping down from that role in 2018, so if there’s something he says about Python, it can usually be taken as correct.

By the way, if you’re curious to see your version of Python, you can simply paste the following into a program:

import sys
print("My version number: {}".format(sys.version))

Here, the text inside the quotation marks is printed as is, except for the braces {}, which are replaced by the Python version, that is, sys.version. This is classic string formatting.
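If you need to act on the version rather than just print it, sys.version_info is usually more convenient than the sys.version string. The sketch below assumes Python 3.6 or later, because it uses f-strings; the variable names are my own.

```python
import sys

# sys.version_info behaves like a tuple: (major, minor, micro, releaselevel, serial)
major, minor, micro = sys.version_info[:3]
version_text = f"{major}.{minor}.{micro}"
print(f"My version number: {version_text}")

# Tuples compare element by element, so this reads as "version 3.0 or later".
is_python3 = sys.version_info >= (3, 0)
print("Running Python 3:", is_python3)
```

The three unpacked numbers correspond to the A.B.C scheme described earlier.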

Alternatively, if you are using a Mac or Linux computer, you can do the same thing directly in the Python interpreter, started from the Mac OS X Terminal or a Unix shell. Windows users can use the built-in Command Prompt (cmd.exe) shown earlier, or install third-party software such as Cygwin for a Unix-like environment. In any case, we need this incredibly useful Terminal to get to our next point.

Using the Python Shell

One of the most useful aspects of Python, I found, is that it comes with its own shell. The Python shell can be started by typing python on the command line (on Windows, there should also be an application that you can just double-click). You’ll then see the default version number, a copyright notice, and three arrows (or “r-angles” in LaTeX-speak), >>>, asking for your input. If you have multiple versions of Python installed, you may need to add the version number, as in python3.3, to get the correct version.

So why is the Python shell so useful? Simply put, it lets you test out simple commands in isolation. In many cases, you’ll be able to detect a syntax or logical error in some command before it gets tested in some gigantic script that could be time-intensive or memory-hungry.

In a session on my MacBook Pro’s Terminal, using the Python 3.3 shell, I ran a short set of commands that sets up an empty list (list1) and adds to it all the even integers in the range [2, 16).
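The original screenshot is not reproduced in this text, so the following is only a sketch of what such a session might contain, written as a plain script instead of at the >>> prompt. It assumes the list was filled with a for loop over range with a step of 2.

```python
# Build the list of even integers in [2, 16) with an explicit loop.
list1 = []
for i in range(2, 16, 2):   # start at 2, stop before 16, step by 2
    list1.append(i)
print(list1)                # [2, 4, 6, 8, 10, 12, 14]

# Leaving out the step gives every integer from 2 to 15 instead.
all_numbers = list(range(2, 16))
print(all_numbers)
```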

To show why this shell might be useful, suppose I had forgotten the last parameter of range and typed range(2,16) instead. Then, when I printed the contents of the list after the for loop, I would have seen all the numbers between 2 and 15 inclusive, rather than just the even numbers. Since I want only the even numbers in the “real” program I’ve been working on, this would remind me to add that last parameter. It is a very silly example, but it shows how checking what you do in the shell before inserting it into a real program can be beneficial. I’ll be using the shell in some other code bits in this post, which you can recognize by the three “r-angles” >>>.

Other popular languages such as Java and C++ have their own equivalents of the Python shell, but I think you would need to install something extra. And installing Linux-style programs can be nontrivial, since oftentimes there is no nice clickable GUI that handles the installation immediately.

Using ‘os’ and ‘sys’

I find both the sys and os modules to be incredibly useful to me for the purposes of convenience and generality.

First, let’s go over the sys module. Possibly the biggest advantage it offers is the ability to pass command-line inputs to a program. Say you’ve built a large program that performs some task that depends on inputs from the user. For instance, in my machine learning class last semester, I implemented the k-means clustering algorithm. This is a learning algorithm that is given data and groups it into clusters, with the number of clusters given as input. It’s clear that this can be useful in many real-life applications. Someone who has standardized data on medical patients’ records (e.g. blood-sugar levels, height, weight, etc.) may want to group patients into two “clusters,” which could be (1) healthy or (2) ill. Or perhaps there could be n clusters, where patients in lower-numbered clusters have a better outlook than those in higher-numbered ones.

To perform k-means clustering, then, we logically need two inputs: (1) the data itself and (2) the number of clusters. One idea is to put these directly into the program, and then run it. But what happens if we want to keep changing the data file we’re using or the number of clusters? Each time the program has finished executing, we’d have to go back into our text editor and modify it before re-executing it.

A better way would be to use command line arguments. Changing inputs on the command line is usually faster than opening a text editor and retyping the variables. We can do this with sys.argv, which takes in input from the command line. As an extra protection, one can also make sure that the user inputs the correct number of parameters. I have done this with the following code snippet from the k-means clustering algorithm.

import sys

# If the number of parameters is incorrect, terminate.
if len(sys.argv) != 3:
    print("USAGE: kmeans_clustering.py [file] [clusters]")
    sys.exit()

num_clusters = int(sys.argv[2])
with open(sys.argv[1], 'r') as feature_file:
    pass  # Do stuff with the file

Here, sys.argv is the list of command-line arguments, with the name of the script as the first element. If I’ve put in the correct parameters, the program should proceed smoothly, with sys.argv[1] and sys.argv[2] seamlessly incorporated.
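One way to make this kind of argument checking easier to test is to put it in a function that takes the argument list as a parameter. The sketch below is my own illustration, not part of the original k-means script; the helper name parse_args and the positive-integer check are assumptions.

```python
def parse_args(argv):
    """Return (data_path, num_clusters) from an argv-style list, or raise ValueError."""
    if len(argv) != 3:
        raise ValueError("USAGE: kmeans_clustering.py [file] [clusters]")
    data_path = argv[1]
    num_clusters = int(argv[2])   # int() itself raises ValueError for text like "five"
    if num_clusters < 1:
        raise ValueError("clusters must be a positive integer")
    return data_path, num_clusters


# argv[0] is the script name, so the real inputs start at index 1.
example = parse_args(["kmeans_clustering.py", "file1.arff", "4"])
print(example)   # ('file1.arff', 4)
```

In a real script you would call parse_args(sys.argv) and print the error message before exiting.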

In addition to speed, another advantage of command-line arguments is that they can be used as part of a process to automate the same script over and over again. Suppose I wanted to run my kmeans_clustering script over and over again with the cluster value ranging from 2 to 100. One way is to tediously call kmeans_clustering on the command line with ‘2’ as the last parameter, then do the same with ‘3’ after the first run has finished, and then do ‘4’ and so on. In other words, I’d have to call the program 99 times!

A better way is to write another Python script and use os.system to call kmeans_clustering as many times as I want. This is as easy as changing the input to os.system. It takes a string, so I set up a for loop that builds a unique string on each iteration, which then acts as input to the command line. See the following code for an example, where file1.arff is a made-up file name.

import os

# Run the clustering script once for each cluster count from 2 to 100.
for i in range(2, 101):
    input_string = "python kmeans_clustering.py file1.arff " + str(i)
    os.system(input_string)

So now this program will call kmeans_clustering 99 times automatically, each time with a different parameter for the number of clusters. Quite useful, isn’t it? This is one of the biggest benefits of using a program to call another program. Just be wary that if you make a change to a program while another script is calling it, then those changes will be reflected the next time the program gets called.
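The standard library’s subprocess module is a common alternative to os.system. It accepts the command as a list of arguments, which avoids quoting problems with file names that contain spaces. The sketch below only builds the 99 commands without running them, since kmeans_clustering.py and file1.arff are made-up names; the helper build_command is hypothetical. A tiny inline command is run at the end just to show the call itself.

```python
import subprocess
import sys


def build_command(script, data_file, num_clusters):
    """Return the argument list for one run; sys.executable is the current interpreter."""
    return [sys.executable, script, data_file, str(num_clusters)]


# One command per cluster count from 2 to 100 inclusive.
commands = [build_command("kmeans_clustering.py", "file1.arff", k)
            for k in range(2, 101)]
print(len(commands))      # 99
print(commands[0][1:])    # ['kmeans_clustering.py', 'file1.arff', '2']

# Demonstration with a harmless inline program that echoes its first argument.
demo = subprocess.run(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", "7"],
    capture_output=True, text=True, check=True,
)
print(demo.stdout.strip())
```

To actually run the clustering script, you would loop over commands and pass each one to subprocess.run.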

List Comprehension

In my opinion, list comprehension, the process of forming lists out of other lists or structures, exemplifies the beauty and simplicity of Python programming. Remember the code I wrote earlier, which set up a list containing all the even integers in [2, 16)? I could have just written the following one-liner:

>>> list1 = [i for i in range(2,16,2)]
>>> list1
[2, 4, 6, 8, 10, 12, 14]

To understand the syntax, it’s helpful to refer to the (old) Python 2.7.5 documentation, which has a nice explanation (emphasis mine):

“A list display yields a new list object. Its contents are specified by providing either a list of expressions or a list comprehension. When a comma-separated list of expressions is supplied, its elements are evaluated from left to right and placed into the list object in that order. When a list comprehension is supplied, it consists of a single expression followed by at least one for clause and zero or more for or if clauses. In this case, the elements of the new list are those that would be produced by considering each of the for or if clauses a block, nesting from left to right, and evaluating the expression to produce a list element each time the innermost block is reached.”

In other words, we’re given some expression that becomes an element of the list, subject to the restrictions imposed by our series of for or if clauses. Sometimes, if there are multiple loops and conditionals to evaluate, the comprehension is easier to read when split into multiple chunks. I do this in the comments in the code example below. (If it’s absolutely necessary to introduce line breaks to understand a list comprehension, the code might be a tad too complicated, but I believe it’s perfectly fine to use the list comprehension in this example.)

list2 = [(x, x**2, y) for x in range(5) for y in range(3) if x != 2]

'''
list2 = [(0, 0, 0), (0, 0, 1), (0, 0, 2), (1, 1, 0), (1, 1, 1), (1, 1, 2), (3, 9, 0), (3, 9, 1), (3, 9, 2), (4, 16, 0), (4, 16, 1), (4, 16, 2)]

This expression can be easily understood as:

list2 = []
for x in range(5):
    for y in range(3):
        if x != 2:
            list2.append((x, x**2, y))
'''

The documentation indicates that it’s also possible to create nested lists through list comprehension. This can be especially useful if one wants to initialize something like a matrix or a table. When I wrote my first Python program about a year ago, I used list comprehension to construct a table of elements that I would update as part of a dynamic programming algorithm.

>>> list3 = [[0 for i in range(3)] for i in range(3)]
>>> list3
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]
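One reason to build such a table with a nested comprehension, not with list repetition, is that the two behave differently once you start updating cells. This sketch (the variable names are my own) shows the difference:

```python
# Comprehension: each row is a separate list object.
table = [[0 for _ in range(3)] for _ in range(3)]
table[0][0] = 1

# Repetition: the outer list holds three references to the same row.
aliased = [[0] * 3] * 3
aliased[0][0] = 1

print(table)    # [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
print(aliased)  # [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
```

With the repetition form, a single assignment appears in every row, which is rarely what you want for a dynamic programming table.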

Classes and Functions

It’s pretty simple to define a function in Python, using def, such as the following trivial example, which counts the number of zeros in the input, which will be a string.

def count_zeros(string):
    total = 0
    for c in string:
        # Compare against the character '0'; comparing to the integer 0 never matches.
        if c == '0':
            total += 1
    return total

print(count_zeros('00102')) # Prints 3

Recursive functions are also straightforward and behave as in most major object-oriented programming languages.
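As a small illustration of recursion, here is a sketch of the classic factorial function. It is only an example of the pattern (a base case plus a call on a smaller input) and is not taken from any code discussed in this post.

```python
def factorial(n):
    """Return n! for a non-negative integer n using recursion."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n <= 1:                       # base case stops the recursion
        return 1
    return n * factorial(n - 1)      # recursive case works on a smaller input


print(factorial(5))  # 120
```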

Compared to Java, I have not used many classes in my Python programs, so my expertise in this domain is quite limited. Still, classes are an important part of object-oriented languages, and Python is (contrary to some people’s opinions) object-oriented, so it’s worth reading the Classes documentation if you have the time. That documentation states affirmatively that:

“Python classes provide all the standard features of Object Oriented Programming: the class inheritance mechanism allows multiple base classes, a derived class can override any methods of its base class or classes, and a method can call the method of a base class with the same name. Objects can contain arbitrary amounts and kinds of data. As is true for modules, classes partake of the dynamic nature of Python: they are created at runtime and can be modified further after creation.”
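The features named in that quote can be seen in a few lines. The classes below (Shape and Rectangle) are hypothetical names invented for this sketch; they show a derived class overriding a method, calling the base-class method of the same name, and being modified after creation.

```python
class Shape:
    def __init__(self, name):
        self.name = name

    def area(self):
        return 0

    def describe(self):
        return f"{self.name} with area {self.area()}"


class Rectangle(Shape):
    def __init__(self, width, height):
        super().__init__("rectangle")     # reuse the base-class initializer
        self.width = width
        self.height = height

    def area(self):                       # overrides Shape.area
        return self.width * self.height

    def describe(self):                   # extends the base-class method of the same name
        return "A " + super().describe()


rect = Rectangle(3, 4)
print(rect.describe())                    # A rectangle with area 12

# Classes can be modified after creation: add a method at runtime.
Rectangle.perimeter = lambda self: 2 * (self.width + self.height)
print(rect.perimeter())                   # 14
```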

File Management

With many Python scripts using files as input, such as the kmeans_clustering code I posted earlier, it’s important to know the correct ways to incorporate files in one’s code. The official documentation explains that the built-in open() function is used for this purpose. It is pretty straightforward, and we can loop through the file to analyze it line by line. Alternatively, we can use the readlines() method to create a list consisting of each line in the file, but be wary of doing this if the file is large.

f = open('test.txt', 'r')
for line in f:
    pass  # Read each line
f.close()

The f.close() call is important, since it frees up the system resources held by the open file.

I almost never use f.close() though, because I can use the with keyword, which automatically closes the file for me once the code exits its block. In fact, I posted an example of this earlier when I talked about my kmeans_clustering code. The relevant parts are reproduced below.

import sys

# Note: sys.argv[1] is the file, and 'r' means I'm reading it
with open(sys.argv[1], 'r') as feature_file:
    all_lines = feature_file.readlines()
    for i in range(len(all_lines)):
        pass  # Do stuff here

Now, in most cases, I believe that you don’t absolutely need to take advantage of f.close(), since if a script that was reading in a basic text file (but doesn’t close it) finishes running, then that text file should automatically be closed anyway. I can imagine that things can get more complicated with multiple files and scripts running together, so I’d get in the habit of using with to read in files.

If you’re interested in writing files, you can change the r parameter to w (or r+ which will enable both reading and writing) and use the file.write() method. This is common in research settings, where you might have to modify text files in accordance with some experiments.
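To see writing and reading together, here is a minimal sketch that writes two lines to a file and reads them back, using with for both steps. It works in a temporary directory (via the standard tempfile module) so that nothing is left behind; the file name test.txt and the variable names are placeholders.

```python
import os
import tempfile

# Work in a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")

    # 'w' creates the file (or truncates an existing one) for writing.
    with open(path, "w") as out_file:
        out_file.write("first line\n")
        out_file.write("second line\n")

    # 'r' opens it for reading; iterating yields one line at a time.
    lines = []
    with open(path, "r") as in_file:
        for line in in_file:
            lines.append(line.rstrip("\n"))

    # The with block has already closed the file at this point.
    was_closed = in_file.closed

print(lines)       # ['first line', 'second line']
print(was_closed)  # True
```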

Copying Structures (and Basic Memory Management)

Since it’s so simple to make a list in Python, one might think copying is just as straightforward. When I was first starting out with the language, I would often try to make separate copies of lists using the simple assignment operator.

>>> list1 = [1,2,3,4,5]
>>> list2 = list1
>>> list2.append(6)
>>> list2
[1, 2, 3, 4, 5, 6]
>>> list1
[1, 2, 3, 4, 5, 6]

Notice what happens? I make list1 and try to make list2 a copy of that list via assignment. But if I modify list2, such as by adding the number 6, it also modifies list1! This is a deceptive but important point. Assigning one list to another name simply creates two variable names pointing to the same list in memory. The same applies to any ‘container’ item, such as a dictionary.

Since simple assignment does not create distinct copies, Python offers the built-in list() function as well as generic copy operations in the copy module. It’s also possible to perform copies using slicing. Some solutions are shown below.

>>> list3 = list(list1)
>>> list1
[1, 2, 3, 4, 5, 6]
>>> list3
[1, 2, 3, 4, 5, 6]
>>> list3.remove(3)
>>> list3
[1, 2, 4, 5, 6]
>>> list1
[1, 2, 3, 4, 5, 6]
>>> import copy
>>> list4 = copy.copy(list1)
>>> list5 = list1[:]

There’s one more thing to worry about: what if you have containers within containers? While implementing a machine learning algorithm last semester, I ran into the problem of copying dictionaries that contained dictionaries. I thought I was okay using the straightforward dict operation, but I realized that if I changed a dictionary within one of those dictionaries, that change would be reflected in both of the larger dictionaries!

An example of this error is shown below, where I modify dict2’s first dictionary by adding the 'z1' -> 60 mapping. That key-value pair also shows up in dict1’s first dictionary.

>>> dict1 = {'a': {'x1' : 20, 'y1' : 40}, 'b': {'x2' : 15, 'y2' : 30}}
>>> dict2 = dict(dict1)
>>> dict2
{'a': {'x1': 20, 'y1': 40}, 'b': {'x2': 15, 'y2': 30}}
>>> dict2['a']['z1'] = 60
>>>
>>> dict2
{'a': {'x1': 20, 'y1': 40, 'z1': 60}, 'b': {'x2': 15, 'y2': 30}}
>>> dict1
{'a': {'x1': 20, 'y1': 40, 'z1': 60}, 'b': {'x2': 15, 'y2': 30}}

The solution is to use the copy.deepcopy function, which copies everything, including nested containers. This is the most memory-intensive of the copying solutions I’ve discussed here, so use it only if the other methods won’t work.
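The difference between the copying approaches is easiest to see side by side. In this sketch (variable names are my own) a slice copies a flat list, while a nested dictionary is copied once with dict() and once with copy.deepcopy:

```python
import copy

# Flat list: a slice makes an independent (shallow) copy.
numbers = [1, 2, 3]
numbers_copy = numbers[:]
numbers_copy.append(4)
print(numbers)                    # [1, 2, 3]

# Nested dictionaries: compare a shallow copy with a deep copy.
original = {'a': {'x1': 20, 'y1': 40}, 'b': {'x2': 15, 'y2': 30}}
shallow = dict(original)          # new outer dict, shared inner dicts
deep = copy.deepcopy(original)    # new outer dict and new inner dicts

deep['a']['z1'] = 60              # does not touch original
print(original['a'])              # {'x1': 20, 'y1': 40}

shallow['b']['z2'] = 70           # leaks into original through the shared inner dict
print(original['b'])              # {'x2': 15, 'y2': 30, 'z2': 70}
```

A shallow copy is enough when the container holds only immutable items such as numbers or strings; deepcopy is for the nested case.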

Dictionaries and Sets

Lists are by far the most common data structure I use when Python programming, but I still make extensive use of dictionaries, sets, and other data structures since they have their own advantages.

A set is simply a container that holds items, like a list, but only holds distinct elements. That is, if you add in element X to a set already containing X, the set doesn’t change. This can be an advantage of sets over lists, since I often need to ignore duplicates when I’m managing lists, and making a set based on a pre-existing list is as easy as typing in set(list_name).

But possibly an even bigger advantage of sets is their super fast lookup. Testing whether an element is in a list takes O(n) time. With sets, however, membership testing takes O(1) time on average. Of course, this requires set elements to be hashable, which means that each item must be associated with a fixed number (its “hash value”) so that it can be looked up in a table quickly.

>>> example1 = [i for i in range(5)]
>>> example2 = [i for i in range(3,8)]
>>> example3 = example1 + example2
>>> example1
[0, 1, 2, 3, 4]
>>> example2
[3, 4, 5, 6, 7]
>>> example3
[0, 1, 2, 3, 4, 3, 4, 5, 6, 7]
>>> set_example1 = set(example3)
>>> set_example1
{0, 1, 2, 3, 4, 5, 6, 7}

Of course, the downside of sets compared to lists is that they are unordered, so they don’t support indexing of elements. This is a pretty big drawback, but regardless, if you don’t care about order and duplicates, and want speedy membership testing, sets are the way to go.
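The hashability requirement mentioned above has a practical consequence: immutable values such as numbers, strings, and tuples can go into a set, but a list cannot. A short sketch, with made-up variable names:

```python
readings = [3, 1, 3, 2, 1]
unique = set(readings)            # duplicates collapse into one entry each
has_two = 2 in unique             # fast membership test

pairs = {(0, 1), (1, 2)}          # tuples are hashable, so this works

try:
    bad = {[0, 1]}                # lists are mutable and therefore unhashable
    list_allowed = True
except TypeError:
    list_allowed = False

ordered = sorted(unique)          # sorted() returns a list when order matters
print(ordered)                    # [1, 2, 3]
print(has_two, list_allowed)      # True False
```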

In addition to sets, I find dictionaries to be an incredibly useful data structure. A dictionary associates a value with each key, so it’s essentially a function that pairs up elements.

>>> dict_example = {'Bob' : 21, 'Chris' : 33, 'Dave' : 40}
>>> dict_example
{'Bob': 21, 'Chris': 33, 'Dave': 40}
>>> dict_example['Adam'] = 11
>>> dict_example
{'Bob': 21, 'Chris': 33, 'Dave': 40, 'Adam': 11}
>>> dict_example['Bob']
21

There are many scenarios where dictionaries are useful. As an added benefit, searching the values by key is efficiently done in constant time, just like in sets. Due to their widespread use, dictionaries are one of the most heavily optimized data structures in basic Python.
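One such scenario is counting how often each item occurs. The sketch below uses made-up data and relies on the dictionary get method, which returns a default value when the key is missing:

```python
words = ["spam", "eggs", "spam", "ham", "spam"]

counts = {}
for word in words:
    # get() returns 0 for a word we have not seen yet.
    counts[word] = counts.get(word, 0) + 1

missing = counts.get("bacon", 0)      # no KeyError for an absent key
most_common = max(counts, key=counts.get)

for word, n in counts.items():        # iterate over key-value pairs
    print(word, n)
print(most_common)                    # spam
```

Each lookup and update here is a constant-time operation on average, which is what makes dictionaries a good fit for this kind of tallying.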

Conclusion

I hope you liked this post, a broad overview of Python. How much you get out of it will, of course, depend on your prior skill level and experience with programming. To really understand these concepts, though, I urge you to write some programs yourself, as I did; practice is what makes you an expert. If you discover something interesting about Python, write about it, since knowledge grows as we share it, and search online whenever you run into a concept you need help with. Once you start with Python, try not to fall in love with it! Have fun programming in Python.