compsci bits and bytes: 2009

Saturday, April 25, 2009

Quicksort Algorithm

The quicksort algorithm is similar to merge sort in that it takes the same divide and conquer approach that the merge sort algorithm uses. Quicksort works by partitioning an array into two sub-arrays. Merge sort does the same thing, but quick sort does it by seperating the partition around a pivot point. The numbers less than the pivot point are placed into the lower array and those above into a higher array. The function then runs recursively continuing to sort the arrays until they are seperated into fully sorted arrays. The array ends completly sorted and merged back into one array. Quick sort is normally extremely fast in practice, although its worst case scenario running time is O(n^2), but the average case is O(n*logn^2) making it the fastest of the five sorting algorithims looked at.

Friday, April 24, 2009

Merge Sort

Merge Sort is a more advanced sorting algorithm it is more complex to code, understand, and implement, but it also provides a more efficient algorithm. The merge sort algorithm works by dividing the array into separate arrays and then sorting the smaller arrays and then eventually merging the arrays back together sorted. The sorting is done by recursive calls to the merge sort function. The array is divided as many times as it takes in order to get an array with only one element. Once an array with one element is reached the arrays are then placed back together with each array being put back in the proper order through recursive calls to the merge sort function. After each array has been sorted it a call to the function merge places it back into one array properly sequence by keeping track of the smallest element in each array and inserting the smaller of the two until every element has been placed back into the original array. The advantages that merger sort brings is that it has a time complexity of O(nlogn) as compared to O(n^2) of the other three algorithms looked at so far.

Thursday, April 23, 2009

Insertion Sort

Insertion sort is the third sorting algorithm that I will be explaining how it works as well as the advantages and disadvantages of its use. The idea of insertion sort is partitioning the array into a sorted and unsorted sections. Then you insert each element from the unsorted partition into its proper place in the sorted partition. This is easily accomplished by starting the sorter partition with only one element, the first element in the array, the other side will have n-1 elements. The algorithm will continue until eventually the sorted side will have n-1 elements and the unsorted side will have n elements. Then the last element is properly placed into the sorted section and the array has successfully been sorted. The advantages of of insertion sort is that it can be quicker than the other algorithms because it takes into account partially sorted arrays. The only problem is it is still not a very efficient algorithm because when the list is in the reverse order it will end up the same running time as the other two algorithms at big Oh (n^2).

Tuesday, April 21, 2009

Bubble Sort

Bubble sort is another sorting algorithm which can easily be implemented using C++ and other programming languages. The idea behind the bubble sorting algorithm is that you compare two elements which are adjacent to one another and exchange the elements if the larger one is not in the proper order. With each pass of the array the element becomes more and more sorted and eventually can sort the array into either ascending or descending order. An example of a few passes of a bubble sort algorithm is provided below.

Pass 1

21 15 43 10 23
15 21 43 10 23
15 21 10 43 23
15 21 10 23 43

Pass 2

15 21 10 23 43
15 10 21 23 43

Pass 3

15 10 21 23 43
10 15 21 23 43

Again the bubble sort algorithm is simple to understand and easy to implement. The disadvantages of bubble sort algorithm is that they can often have long running times due to a large time complexity and the need for many passes of through the entire array.

Saturday, April 18, 2009

Selection Sort

Selection sort is a relatively simple sorting algorithm. It can be used to sort the data elements within an array. To explain the algorithm I will refer to an array called array A, which contains n number of elements of a particular data type. To make the example easier to explain assume that it contains unsorted integer numbers. In order to get the array in the correct order you need to find the index of the largest element within the array. Once that number has been located it can then be swapped with the last index within the array. The algorithm then will run through the array again taking the next largest integer and swapping it into the index of [n-2]. This process will continue until the index of A[0] is reached. Below I have provided a trace of the selection sort algorithm.

Original Array: 29 10 14 37 13

After 1st swap: 29 10 14 13 37
After 2nd swap: 1310 14 29 37
After 3rd swap: 13 10 14 29 37
After 4th swap: 10 13 14 29 37

The selection sort algorithm is a simple algorithm to implement, but at times can have a relatively large time complexity. It also is an internal sorting algorithm which requires it to all be able to fit in primary memory.

Thursday, April 16, 2009

Sorting Algorithms

Sorting is the process of taking data and arranging it into either ascending or descending order. It is a very common process that is utilized often in computer programming. There are two categories of sorting, internal sorts or external sorts. An internal sort requires that the entire collection of data fits inside the computers main memory. An external sort requires that the data would have to be stored in secondary memory. The memory of a computer is divided into either the primary memory or secondary memory. Secondary memory would be held on the hard drive, while the primary memory is on the CPU itself. Internal sorts are able to be performed at a quicker rate than the external sort. Through the next several posts I will be looking at specific sorting algorithms and how each is implemented. There are certain assumptions that need to be made in order to understand the algorithm. The algorithm will be passed an array n elements of type int, and those will be sorted into ascending order.

Wednesday, April 15, 2009

C++ Standard Template Libraries

Many people are surprised to realize that they do not always have to write all the code for everything they are trying to do. Many data structures and commonly used functions are already written and stored in templates online. Those for C++ can be found here. The templates on the website contain the basic codes for common data structures like lists, vectors, stacks, queues and several other basic structures. The codes are categorized and easy to locate. The codes provided are the basics for the particular data structure and can then be modified to accomplish what it is that you need the code to do in particular. It is a very useful resource which allows to drastically cut down programming time. Most other computer programming languages have similar sites which contain codes for structures in those languages.

Monday, April 13, 2009

Queues

Another ADT is the queue. The queue is the opposite of a stack in that it is known as first in and first out data structure. All data insertions in a queue come in at one end of the list and all the deletions will come at the other end of the list. It can be thought of like the line at the supermarket. A person enters at one end is exits at the other end. There are several ways to implement a queue . The first is a pointer based implementation. The pointer based implementation is essential the same as a linked list with two extra pointers to track the first and last data elements in the queue. The second implementation is done through and array. The array is implemented by setting the max size and when it is completely filled the array dumps the data that was in the last element of the array and shifts all the other elements down to the next element. One of the common uses of queues are when a certain resource is needed by more than one part of the computer. This can commonly be found in the use of a printer since only one item can be printed at a time queues are often used to delegate the sequence of documents printed.

Sunday, April 12, 2009

Stacks

There are several different types of processing data in a computer. One ways is known as a stack. A stack is like a list in that it has a set of functions and set of data. A stack can be thought of as a last in first out data structure. When I was taught about stacks I was told to think of it like a game of cards in which you add a card to the top and the next person can either add a card to the top of the deck or take the card that is on the top of the deck. The same thing happens with stacks of data. When data is placed on the stack it is known as pushing data onto the stack, and when data is removed from the stack it is known as popping the stack. The stack also can inherit all of the same functions of the linked list and then additionally requires a push function, pop function, display and top functions. The display function will be used to display all the elements of the stack. The top function reads the top of the data. One of the limitations of stacks are that you can only access the top element on the stack.

Sunday, April 5, 2009

Linked Lists

A linked list is a specially designed class that allows for a set of objects to be grouped together with a set of operations. Lists can be used for many different computer science implementations. List usually contain a specific set of functions. The essential functions are the constructor, retrieve, insert, remove, isEmpty, make empty, printList, and search. Each of these have a specific function to keep the list properly linked. A linked list is implemented by creating a member of a specific data type. The member is then placed into a node. The node is then linked to the next node in the list. It works so that everytime a new node is added to the data structure a link is then placed between the previous node in the list. The functions in the linked list allow for it to be maintained and allow for data to be placed into specific place within the list. Linked lists come in several types, a single linked list is only linked once and is created using only two pointers. A double linked list is linked twice and require more pointers. There is also circulatory lists which are list which link the last node back to the first node.

Thursday, April 2, 2009

Recursion

Recursion is a function that that either directly or indirectly makes a call to itself. The idea behind recursion is that you can break down a large problem into smaller problems until it is finally just a simple problem to solve. Recursive solutions are somewhat difficult to create because they must be made very carefully. The idea of the solution is for each successive call of the function should bring you closer and closer to an easily solved solution. The easily solved solution is known as the base case. Every recursive case is required to have at least one base case. It also must have at least one general case. The general case is any of the recursive steps that you take in order to get to the base case. A common way to look at a recursive case is by examining a function that computes the Fibonacci number.

int Fibonacci(int n){

if (n == 0)

return 0; // base case

if (n == 1)

return 1; // base case

return Fibonacci(n – 1) + Fibonacci(n– 2); // recursion}

One problem with recursion is that many times you can create an infinite recursion which is similar to an infinite loop. It is a fatal error that will cause the program to never stop running because it cannot get out of infinite recursion. However, recursion can sometimes be the best solution to a problem.

Thursday, March 19, 2009

Algorithm Analysis

Algorithm Analysis is another essential part of the computer programming process. Algorithm analysis is analyzing the algorithm that you have created for efficiency and running time. There are different ways to solve programming problems and many times one way is way more efficient than another. In algorithm analysis you are presented with ways to decide how long a certain algorithm will take to run. There are several different types of analysis that can be done. The type of analysis I will focus on is asymptotic analysis. Asymptotic analysis is a technique of graphing different scenarios in different types of notation to get a picture of what the running time would be. Big Oh notation is one of the types of asymptotic analysis. This analysis allows you to find the worst case scenario of an algorithm and ensure that the algorithm is running below that time. Big omega is analysis which finds the best case scenario and theta notation which finds the average case scenario. Using these notations you can analyze the algorithm to get a picture of the running time of a program and then compare which algorithms are more efficient than others.

Pointers in C++

Pointers are a key feature within the C++ programming language. Pointers are variable used to point to different data within a program. An analogy used to describe pointers is that your address is a pointer to your home. The pointer has its own memory address. Pointers must be initialized to a specific data type. A pointer of type int can only point to int data, while a pointer of type string can only point to data of type string. Pointers are used for many different uses within C++.

One of the most essential uses of a pointer is to use it to point to the different elements of an array of dynamic memory. Dynamic memory is memory created with the new operator which enables you to easily remove the memory after it is no longer necessary for to the program. This is necessary in C++ because it does not have garbage collection. This makes the pointer essential to programming in C++. Once memory is dynamically allocated you can reference it using the pointer. The use of the pointer to reference the memory saves memory space and allows the programmer to function at a higher efficiency than if programmed without the use of the pointer and dynamic memory allocation.

Wednesday, February 25, 2009

C++

C++ is a commonly used programming language. It was developed by Bjarne Stroustrup in 1979 as an upgrade from the then commonly used C language. C++ began to incorporate a newer style of programming which is called object oriented programming. C++ first began to incorporate classes and then added member functions, operator overloading, exception handling and several other more advanced features now common in computer programming. C++ is now used readily in many different types of programming. It is commonly used to develop system software, device drivers, and is also implemented in video game development. C++ is like other high level languages uses keywords as well as variables and operators to develop the code for computer softwares. These keywords and variables become the source code for the software. The source code is then put through a complier. The compiler is a program itself that takes the source code and converts it into binary code which the computer can understand. The compiler will also reveal some of the errors that computer programmers will face during the programming process.

Tuesday, February 17, 2009

Languages

Thus far in my computer science studies I haven't had a ton of practice with many of the computer programming languages that are in wide use today. I have used Java and begun to work with C++. I have realized that there are many similarities between the two languages in both syntax and overall style, but Java and C++ are only a small part of the spectrum of computer programming languages.

In software development it is fairly common that the software is developed using many different computer languages. Windows Vista, for example, was developed using C#, C++, .Net framework and several other languages. Although Vista is an operating system which is complex computer software, it is common practice to use different languages for a single software. This is done because certain languages are best suited for different parts of programming. Many times this will be done by different programmers that specialize in different languages. I would like to take a look into some of the different languages over the next few posts and focus on what languages are better suited for different types of programming. This allows for a better understanding of the programming process and the different tools utilized during it.

Tuesday, February 3, 2009

A Constant Change

Computer programming is the basis of software development and is a quickly evolving science. Technological advances are constantly changing the programing industry. Only forty years ago computers were programmed using binary code and punch cards. Now all software is developed using high level programming languages like Java and C++. Software development will always change so software developers must always learn the advancements of technology. The software development process can be broken down into steps that will always be useful in development though.

The process begins with analyzing the product that needs to be developed. Usually this starts with identifying the user's needs and then designing and testing a solution. The solution is referred to as an algorithm. Once the algorithm has been developed it is then coded by a computer programmer. Many times the algorithm is developed by one group of people and the coding is done by another set of people, but this is not always true and I will analyze the differences as I break down recently developed software in future posts.

compsci bits and bytes