UNIT-5 SELECTED TOPICS
Algebraic computation
Algebraic computation in the context of Design and
Analysis of Algorithms (DAA) typically involves applying algebraic techniques
to analyze the performance of algorithms, particularly in terms of time complexity,
space complexity, and other resource requirements.
Here are some key aspects of algebraic computation
in DAA:
1. Big-O Notation:
Big-O notation is used to describe the asymptotic upper bound of an algorithm's complexity. It expresses how the runtime or space requirements grow as the size of the input increases.
Algebraic computation involves finding a function f(n) that bounds the behavior of the algorithm as a function of the input size n.
2. Recurrence Relations:
Recurrence relations often arise when analyzing recursive algorithms.
For example, the time complexity of a divide-and-conquer algorithm like Merge Sort is expressed by the recurrence relation: T(n) = 2T(n/2) + O(n)
Solving such recurrences involves algebraic techniques such as substitution, iteration, or using the master theorem.
3. Solving Recurrences:
Substitution Method: Guess the form of the solution and prove it by induction.
Iteration Method: Unfold the recurrence step by step and sum the results.
Master Theorem: Provides a straightforward way to analyze divide-and-conquer recurrences of the form: T(n) = aT(n/b) + O(n^d)
4. Summations:
Many algorithms involve summing over a range of values, such as summing the work done at each level of a recursive algorithm. Algebraic methods can help simplify such sums.
For example, the sum of the first n integers is given by: ∑_{i=1}^{n} i = n(n+1)/2
5. Polynomials and Exponentials:
In complexity analysis, it is common to deal with polynomials (e.g., O(n^2)) and exponential functions (e.g., O(2^n)).
Algebraic computation helps simplify and compare such functions in terms of growth rates.
For example, O(n^2) grows slower than O(2^n) as n increases.
6. Matrix Multiplication:
Some algorithms, such as those related to dynamic
programming or solving systems of linear equations, involve matrix
multiplication.
Algebraic computation in this context involves
understanding the cost of matrix operations, which often requires handling
polynomials, logarithms, and powers of matrices.
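To make the cost of matrix operations concrete, here is a minimal Python sketch of the naive O(n^3) matrix multiplication (the function name matrix_multiply is our own choice, not part of the notes):

def matrix_multiply(A, B):
    """Naive matrix multiplication: O(n^3) scalar multiplications for n x n matrices."""
    n, m, p = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "inner dimensions must agree"
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            for j in range(p):
                C[i][j] += A[i][k] * B[k][j]
    return C

# 2x2 example; the three nested loops make the cubic cost explicit.
print(matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # -> [[19, 22], [43, 50]]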
7. Generating Functions:
Generating functions are often used in combinatorial
algorithms to solve recurrence relations, particularly in counting problems.
For example, generating functions can help in
solving recurrences that arise in dynamic programming.
Example: Solving a Recurrence Using the Master Theorem
Consider the recurrence:
T(n) = 3T(n/2) + O(n)
This recurrence follows the form T(n) = aT(n/b) + O(n^d) where a = 3, b = 2, and d = 1.
To apply the master theorem:
Compute log_b(a) = log_2(3) ≈ 1.585.
Compare d with log_b(a):
If d < log_b(a), the complexity is O(n^(log_b a)).
If d = log_b(a), the complexity is O(n^d log n).
If d > log_b(a), the complexity is O(n^d).
Since 1 < 1.585, the complexity is O(n^(log_2 3)) ≈ O(n^1.585).
This type of algebraic computation allows you to
derive the time complexity of recursive algorithms efficiently.
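The case analysis above is mechanical enough to capture in a few lines of code. The following Python sketch (the helper name master_theorem is our own, not from the notes) classifies a recurrence of the form T(n) = aT(n/b) + O(n^d):

import math

def master_theorem(a: int, b: int, d: float) -> str:
    """Classify T(n) = a*T(n/b) + O(n^d) using the master theorem."""
    log_b_a = math.log(a, b)              # critical exponent log_b(a)
    if d < log_b_a:
        return f"O(n^{log_b_a:.3f})"      # leaves dominate
    elif d == log_b_a:                    # exact float comparison; fine for illustration
        return f"O(n^{d} log n)"          # work is balanced across levels
    else:
        return f"O(n^{d})"                # root dominates

# Example from the text: T(n) = 3T(n/2) + O(n)
print(master_theorem(3, 2, 1))            # -> O(n^1.585)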
Fast Fourier Transformation for polynomial multiplication
Given two polynomials A(x) and B(x), find the product C(x) = A(x)*B(x). There is a naive O(n^2) approach to solve this problem, which uses the coefficient form of the polynomials to calculate the product.
A coefficient representation of a polynomial A(x) = ∑_{j=0}^{n-1} a_j x^j is the vector a = (a0, a1, …, a(n-1)).
Example:
A(x) = 6x^3 + 7x^2 - 10x + 9
B(x) = -2x^3 + 4x - 5
Coefficient representation of A(x) = (9, -10, 7, 6)
Coefficient representation of B(x) = (-5, 4, 0, -2)
Input:
A[] = {9, -10, 7, 6}
B[] = {-5, 4, 0, -2}
Output:
-12x^6 - 14x^5 + 44x^4 - 20x^3 - 75x^2 + 86x - 45
We can do better if we represent the polynomial in another form.
The idea is to represent the polynomial in point-value form and then compute the product. A point-value representation of a polynomial A(x) of degree-bound n is a set of n point-value pairs {(x0, y0), (x1, y1), …, (x(n-1), y(n-1))} such that all the xi are distinct and yi = A(xi) for i = 0, 1, …, n-1.
Example:
A(x) = x^3 - 2x + 1
xi    -- 0, 1, 2, 3
A(xi) -- 1, 0, 5, 22
The point-value representation of the above polynomial is {(0, 1), (1, 0), (2, 5), (3, 22)}. Using Horner's method, evaluating A(x) at one point takes O(n) time, so evaluating it at n different points takes O(n^2) time. Once both polynomials are in point-value form, the product C(x) = A(x)*B(x) is obtained by multiplying the values pointwise, which takes O(n) time. An important point here is that C(x) has degree-bound 2n, so n points would give only n values of C(x); we therefore need 2n distinct values of x to obtain the 2n point-value pairs of C(x). Once the product is calculated, the answer has to be converted back into coefficient vector form. To get back to coefficient vector form we use the inverse of evaluation, which is called interpolation. Interpolation using Lagrange's formula converts the point-value form back to the coefficient vector form of the polynomial. Lagrange's formula is:
A(x) = ∑_{k=0}^{n-1} y_k * ( ∏_{j≠k} (x - x_j) ) / ( ∏_{j≠k} (x_k - x_j) )
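As an illustration of the interpolation step, the following Python sketch (the function name lagrange_interpolate is our own, and it uses exact fractions purely for clarity) recovers the coefficient form from point-value pairs using Lagrange's formula:

from fractions import Fraction

def lagrange_interpolate(points):
    """Recover coefficient form [c0, c1, ...] from point-value pairs via Lagrange's formula."""
    n = len(points)
    coeffs = [Fraction(0)] * n
    for k, (xk, yk) in enumerate(points):
        # Build the numerator polynomial  prod_{j != k} (x - x_j)  as a coefficient list.
        num = [Fraction(1)]
        denom = Fraction(1)
        for j, (xj, _) in enumerate(points):
            if j == k:
                continue
            # Multiply num by (x - xj).
            new = [Fraction(0)] * (len(num) + 1)
            for i, c in enumerate(num):
                new[i] += c * (-xj)
                new[i + 1] += c
            num = new
            denom *= (xk - xj)
        for i, c in enumerate(num):
            coeffs[i] += yk * c / denom
    return coeffs

# Point-value pairs of A(x) = x^3 - 2x + 1 from the example above.
pts = [(0, 1), (1, 0), (2, 5), (3, 22)]
print(lagrange_interpolate(pts))
# -> [Fraction(1, 1), Fraction(-2, 1), Fraction(0, 1), Fraction(1, 1)], i.e. 1 - 2x + x^3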
So far, we have discussed representing polynomials in coefficient form and in point-value form, multiplying them pointwise, and interpolating back to coefficient form.
This idea still solves the problem in O(n^2) time complexity. We can use any points we want as evaluation points, but by choosing the evaluation points carefully, we can convert between representations in only O(n log n) time. If we choose "complex roots of unity" as the evaluation points, we can produce a point-value representation by taking the discrete Fourier transform (DFT) of a coefficient vector. We can perform the inverse operation, interpolation, by taking the "inverse DFT" of point-value pairs, yielding a coefficient vector. The Fast Fourier Transform (FFT) can perform the DFT and inverse DFT in time O(n log n).
DFT
The DFT evaluates the polynomial at the n complex nth roots of unity ω_n^0, ω_n^1, ω_n^2, …, ω_n^(n-1). So, for y_k = A(ω_n^k), k = 0, 1, 2, …, n-1, the vector y = (y0, y1, y2, …, y(n-1)) is the Discrete Fourier Transform (DFT) of the given polynomial.
The product of two polynomials of degree-bound n is a polynomial of degree-bound 2n. Before evaluating the input polynomials A and B, therefore, we first double their degree-bounds to 2n by adding n high-order coefficients of 0. Because the vectors have 2n elements, we use the complex 2nth roots of unity, denoted ω_2n. We assume that n is a power of 2; we can always meet this requirement by adding high-order zero coefficients.
FFT
Here is the divide-and-conquer strategy to solve this problem.
Define two new polynomials of degree-bound n/2, using the even-index and odd-index coefficients of A(x) separately:
A0(x) = a0 + a2x + a4x^2 + ... + a(n-2)x^(n/2-1)
A1(x) = a1 + a3x + a5x^2 + ... + a(n-1)x^(n/2-1)
A(x) = A0(x^2) + x*A1(x^2)
The problem of evaluating A(x) at ω_n^0, ω_n^1, ω_n^2, …, ω_n^(n-1) reduces to evaluating the degree-bound n/2 polynomials A0(x) and A1(x) at the points (ω_n^0)^2, (ω_n^1)^2, …, (ω_n^(n-1))^2, and then combining the results by A(x) = A0(x^2) + x*A1(x^2).
The list (ω_n^0)^2, (ω_n^1)^2, …, (ω_n^(n-1))^2 does not contain n distinct values, but only the n/2 complex (n/2)th roots of unity. Polynomials A0 and A1 are therefore recursively evaluated at the n/2 complex (n/2)th roots of unity. The subproblems have exactly the same form as the original problem but are half the size, so the recurrence formed is T(n) = 2T(n/2) + O(n), i.e., complexity O(n log n).
Algorithm
1. Add n higher-order zero coefficients to A(x) and
B(x)
2. Evaluate A(x) and B(x) using FFT for 2n points
3. Pointwise multiplication of point-value forms
4. Interpolate C(x) using FFT to compute inverse DFT
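The four steps above can be realized with a short recursive FFT. The following Python sketch is illustrative only (the names fft and multiply_polynomials are our own, and it uses Python's built-in complex numbers rather than an optimized library); it reproduces the example product computed earlier:

import cmath

def fft(a, invert=False):
    """Recursive FFT: evaluates the coefficient vector 'a' at the complex nth roots of unity (len(a) must be a power of 2)."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)            # A0: even-index coefficients
    odd = fft(a[1::2], invert)             # A1: odd-index coefficients
    sign = -1 if invert else 1
    result = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)   # ω_n^k (conjugated for the inverse DFT)
        result[k] = even[k] + w * odd[k]              # A(ω_n^k)        = A0(ω_n^2k) + ω_n^k * A1(ω_n^2k)
        result[k + n // 2] = even[k] - w * odd[k]     # A(ω_n^(k+n/2)), since ω_n^(k+n/2) = -ω_n^k
    return result

def multiply_polynomials(a, b):
    """Multiply two coefficient vectors using FFT in O(n log n) time."""
    n = 1
    while n < len(a) + len(b):             # degree-bound of the product is 2n; pad to a power of 2
        n *= 2
    fa = fft(a + [0] * (n - len(a)))
    fb = fft(b + [0] * (n - len(b)))
    fc = [x * y for x, y in zip(fa, fb)]   # pointwise multiplication in point-value form
    c = fft(fc, invert=True)               # inverse DFT (interpolation), then divide by n
    return [round((x / n).real) for x in c]

# Example from the text: A(x) = 6x^3 + 7x^2 - 10x + 9, B(x) = -2x^3 + 4x - 5
print(multiply_polynomials([9, -10, 7, 6], [-5, 4, 0, -2]))
# -> [-45, 86, -75, -20, 44, -14, -12, 0]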
String Matching
Introduction
A string matching algorithm is also called a "string searching algorithm." This is a vital class of string algorithms: they find the places where one or several strings (patterns) occur within a larger string (the text).
Given a text array T [1.....n] of n characters and a pattern array P [1......m] of m characters, the problem is to find an integer s, called a valid shift, where 0 ≤ s ≤ n-m and T [s+1......s+m] = P [1......m]. In other words, we want to find whether P occurs in T, i.e., whether P is a substring of T. The items of P and T are characters drawn from some finite alphabet, such as {0, 1} or {A, B, ..., Z, a, b, ..., z}.
Given a string T [1......n], a substring is T [i......j] for some 1 ≤ i ≤ j ≤ n: the string formed by the characters of T from index i to index j, inclusive. This means that every string is a substring of itself (take i = 1 and j = n).
A proper substring of the string T [1......n] is a substring T [i......j] with either i > 1 or j < n; that is, a substring that is not the whole string.
Using these definitions, for any string T [1......n] the substrings are
T [i.....j] = T [i] T [i+1] T [i+2]......T [j] for some 1 ≤ i ≤ j ≤ n,
and the proper substrings are those with i > 1 or j < n.
Note: If i > j, then T [i.....j] is the empty (null) string, which has length zero.
Algorithms used for String Matching:
There are different methods used for finding a pattern in a string:
1. The Naive String Matching Algorithm
2. The Rabin-Karp Algorithm
3. Finite Automata
4. The Knuth-Morris-Pratt Algorithm
5. The Boyer-Moore Algorithm
1. The Naive String Matching Algorithm
The naive approach tests all possible placements of the pattern P [1.......m] relative to the text T [1......n]. We try shifts s = 0, 1, ..., n-m successively and, for each shift s, compare T [s+1.......s+m] to P [1......m].
The naive algorithm finds all valid shifts using a loop that checks the condition P [1.......m] = T [s+1.......s+m] for each of the n - m + 1 possible values of s.
NAIVE-STRING-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. for s ← 0 to n - m
4.     do if P [1.....m] = T [s + 1....s + m]
5.         then print "Pattern occurs with shift" s
Analysis: The for loop in lines 3 to 5 executes n - m + 1 times (we need at least m characters at the end), and in each iteration we make up to m character comparisons. So the total complexity is O((n - m + 1) m).
Example:
Suppose T = 1011101110
        P = 111
Find all the valid shifts.
Solution: The pattern P = 111 occurs in T at shifts s = 2 and s = 6, since T [3...5] = 111 and T [7...9] = 111.
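For reference, here is a minimal Python sketch of the naive matcher; it is 0-indexed, unlike the 1-indexed pseudocode above, and the function name naive_string_matcher is our own:

def naive_string_matcher(text: str, pattern: str) -> list:
    """Return all valid shifts s with text[s:s+m] == pattern (0-indexed)."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):          # try every possible shift
        if text[s:s + m] == pattern:    # compare up to m characters
            shifts.append(s)
    return shifts

# Example from the text: T = 1011101110, P = 111  ->  valid shifts 2 and 6
print(naive_string_matcher("1011101110", "111"))   # -> [2, 6]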
2. The Rabin-Karp Algorithm
The Rabin-Karp Algorithm is a string-searching algorithm named after its authors, Michael O. Rabin and Richard M. Karp.
Understanding the Rabin-Karp Algorithm
The Rabin-Karp Algorithm is an algorithm utilized for searching/matching
patterns in the text with the help of a hash function. Unlike the Naïve
String-Matching algorithm, it doesn't traverse through every character in the
initial phase. It filters the characters that do not match and then performs
the comparison.
A hash function is a utility to map a larger input value to a smaller
output value. This output value is known as the Hash value.
The Rabin-Karp Algorithm is utilized to find all the occurrences of a provided pattern 'P' in a given string 'S' in O(Ns + Np) expected time, where 'Ns' and 'Np' are the lengths of 'S' and 'P', respectively.
Let us consider the following example to understand it in a better way.
Suppose a string S = "pabctxjabcjfsabcd" and pattern P =
"abc". We are required to find all the occurrences of 'P' in 'S'.
We can see that "abc" is occurring in
"pabctxjabcjfsabcd" at three positions. Thus, we will return that the
pattern 'P' is occurring in string 'S' at indices 1, 7, and 13.
Understanding the Working of the Rabin-Karp
Algorithm
A sequence of the characters is taken and checked for the possibility of
the presence of the given string. If the possibility is found, then character
matching is performed.
Let us consider the following steps to understand the algorithm:
Step 1: Suppose that the given text
string is:

Figure 1: The Given Text String
and the pattern to be searched in the above text string is:

Figure 2: The Pattern
Step 2: Let us start by assigning a
numerical value (v)/weight for the characters we will use in the problem. Here,
we have selected the first ten alphabets only (i.e., A to J).

Figure 3: Text Weights
Step 3: Let x be the length of the
pattern, and y be the length of the text string. Here, we have y = 10 and x =
3. Moreover, let n be the number of characters present in the input set. Since
we have taken input set as {A, B, C, D, E, F, G, H, I, J}. Therefore, n will be
equal to 10. We can assume any preferable value for n.
Step 4: Let us now start calculating the hash value of the pattern. For a pattern of length x, each character's weight is multiplied by a decreasing power of n (here the pattern is CDD, with weights 3, 4, 4):

Figure 4: Hash Value of the Pattern
Hash value of pattern (P) = (v1 * n^(x-1) + v2 * n^(x-2) + ... + vx * n^0) mod 13
= ((3 * 10^2) + (4 * 10^1) + (4 * 10^0)) mod 13
= 344 mod 13
= 6
Note: In the
above calculation, one can select a prime number in such a manner that we can
perform all the calculations with single-precision arithmetic. Here, we have
selected 13 as the prime number. We will discuss the purpose of calculating the
modulus later.
Step 5: We will now calculate the hash value for each text window of size x (the pattern length).
For the first window ABC,
Hash value of text window (s) = ((1 * 10^2) + (2 * 10^1) + (3 * 10^0)) mod 13
= 123 mod 13
= 6
Step 6: We will now compare the hash value of the pattern with the hash value of the text window. If they match, we will perform the character-matching operation.
In the above example, the hash value of the first window (i.e., s) matches the hash value of the pattern (i.e., P). Therefore, we will start matching the characters between ABC and CDD. Since the pattern does not match the first window, we will move to the next window.
Step 7: We will calculate the hash value of the next window (BCC) by subtracting the contribution of the first character and adding the contribution of the next character, as shown below:
s = ((123 - 1 * 10^2) * 10 + 3 * 10^0) mod 13
= 233 mod 13
= 12
We can use the former hash value directly in the following way in order to optimize this process:
s = (n * (t - v[character to be eliminated] * h) + v[character to be included]) mod 13
= (10 * (6 - 1 * 9) + 3) mod 13
= 12
where h = n^(x-1) = 10^2 = 100, taken mod 13, i.e., 100 mod 13 = 9.
Step 8: For BCC, the hash value, s,
will become 12, which is not equal to the hash value of the pattern, P, i.e.,
6. Therefore, we will move to the next window. After performing some searches,
we will manage to find the match for the window CDD in the text string.
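The walkthrough above can be summarized in a short Python sketch. The weights A=1, ..., J=10 and the prime modulus 13 follow the walkthrough; the function name rabin_karp and the sample text "ABCCDDAEFG" are our own (the original figure showing the full text string is not reproduced here), so treat this as an illustrative sketch rather than the notes' exact example:

def weight(ch: str) -> int:
    """Numerical value of a character: A=1, B=2, ..., J=10 (as in the walkthrough)."""
    return ord(ch) - ord('A') + 1

def rabin_karp(text: str, pattern: str, n_base: int = 10, q: int = 13) -> list:
    """Rabin-Karp matcher over the alphabet A..J, using a rolling hash mod the prime q."""
    x, y = len(pattern), len(text)
    if x > y:
        return []
    h = pow(n_base, x - 1, q)                  # weight of the leading character, mod q
    p_hash = 0
    t_hash = 0
    for i in range(x):                         # hash of the pattern and of the first window
        p_hash = (p_hash * n_base + weight(pattern[i])) % q
        t_hash = (t_hash * n_base + weight(text[i])) % q
    matches = []
    for s in range(y - x + 1):
        if p_hash == t_hash and text[s:s + x] == pattern:   # verify on a hash hit (avoids false positives)
            matches.append(s)
        if s < y - x:                          # roll the hash: drop text[s], add text[s+x]
            t_hash = (n_base * (t_hash - weight(text[s]) * h) + weight(text[s + x])) % q
    return matches

# Hypothetical text consistent with the walkthrough (starts with ABCC..., contains CDD):
print(rabin_karp("ABCCDDAEFG", "CDD"))         # -> [3]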
3. String Matching with Finite Automata
String matching algorithms are techniques used in computer science and data analysis to find the occurrence or position of one string (the "pattern") within another string (the "text").

String Matching Algorithms
- Brute-Force Algorithm: This is the simplest method, where you compare the pattern with every substring of the text.
- Knuth-Morris-Pratt (KMP) Algorithm: KMP is more efficient than brute force. It precomputes a longest prefix-suffix (LPS) array for the pattern to avoid unnecessary comparisons, reducing the overall time complexity to O(m + n).
- Boyer-Moore Algorithm: Boyer-Moore uses a "bad character" and "good suffix" heuristic to skip sections of the text, making it very efficient in practice, especially for long texts and patterns.
- Rabin-Karp Algorithm: Rabin-Karp uses hashing to quickly compare the pattern with potential substrings of the text. It has a time complexity of O(m * n) in the worst case but runs in O(m + n) expected time and can be very efficient for typical inputs.
- Aho-Corasick Algorithm: This algorithm is used for searching multiple patterns simultaneously in a text. It constructs a finite automaton and efficiently identifies all occurrences of the patterns.
- Regular Expressions: Many programming languages provide regular expression libraries for string matching. Regular expressions allow for complex pattern matching based on a specified pattern language.
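Since this section is titled string matching with finite automata, here is a minimal Python sketch of that approach (the names build_transition_table and fa_matcher are our own): it precomputes a DFA transition table for the pattern and then scans the text in a single pass, reaching the accepting state exactly when the pattern has been matched.

def build_transition_table(pattern: str, alphabet: str):
    """delta[state][ch] = length of the longest prefix of 'pattern' that is a suffix of
    (the already-matched prefix of length 'state') followed by ch."""
    m = len(pattern)
    delta = [dict() for _ in range(m + 1)]
    for state in range(m + 1):
        for ch in alphabet:
            k = min(m, state + 1)
            # Find the longest pattern prefix that is a suffix of pattern[:state] + ch.
            while k > 0 and not (pattern[:state] + ch).endswith(pattern[:k]):
                k -= 1
            delta[state][ch] = k
    return delta

def fa_matcher(text: str, pattern: str, alphabet: str):
    """Return all shifts at which 'pattern' occurs in 'text', scanning the text once."""
    m = len(pattern)
    delta = build_transition_table(pattern, alphabet)
    state, shifts = 0, []
    for i, ch in enumerate(text):
        state = delta[state].get(ch, 0)   # follow the automaton
        if state == m:                    # accepting state: full pattern matched
            shifts.append(i - m + 1)
    return shifts

# Reusing the earlier example: T = 1011101110, P = 111
print(fa_matcher("1011101110", "111", alphabet="01"))   # -> [2, 6]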
4. The Knuth-Morris-Pratt (KMP) Algorithm
Finding and locating specific patterns or substrings within a text or a larger string is one of the fundamental tasks in computer science. Such functionality is useful for finding a particular keyword or phrase within a larger text, since doing the same manually would be a tedious task.

What are String-Searching Algorithms?
String-searching algorithms are a way to efficiently look for a given substring or a specific pattern within a longer string. These patterns could be a single character, simple words, complex words, or even phrases.
Importance
in Computer Science
- Text
     Processing: String-searching algorithms are at the core of text processing
     tasks, such as searching for keywords in documents, parsing data, and
     extracting relevant information from textual data sources.
- Data
     Retrieval: In databases, string-searching is essential for querying
     information efficiently. It enables the retrieval of records that contain
     specific data patterns, improving the speed of data retrieval.
- Compiler
     Design: Compilers use string-searching algorithms to recognize and parse
     programming language constructs within source code, aiding in the
     compilation process.
- Bioinformatics:
     In bioinformatics, these algorithms help identify genetic sequences,
     patterns in DNA, and protein motifs, facilitating important research in
     genomics and proteomics.
- Network
     Security: String-searching is crucial in intrusion detection systems,
     where patterns of malicious code or network behaviors need to be identified
     rapidly.
- Natural
     Language Processing (NLP): NLP applications rely on string-searching to
     perform tasks like sentiment analysis, named entity recognition, and
     machine translation.
- Web Search
     Engines: Search engines use advanced string-searching techniques to match
     user queries to web pages, providing relevant search results.
Significance
- Efficiency:
     String-searching algorithms are designed for efficiency. They aim to
     minimize the time and resources required to locate patterns within large
     datasets. Efficient algorithms are critical for applications with strict
     performance requirements.
- Algorithmic
     Diversity: There is a wide variety of string-searching algorithms, each
     designed for specific scenarios and data characteristics.
- Problem-Solving:
     String-searching algorithms exemplify algorithmic problem-solving, an
     essential skill in computer science. They provide insight into techniques
     for optimizing data retrieval and pattern-matching tasks.
- Interdisciplinary
     Impact: String-searching algorithms have applications in numerous
     interdisciplinary fields, fostering collaboration between computer
     scientists and researchers in biology, linguistics, and more.
- Continual
     Advancements: The field of string-searching algorithms continues to
     evolve, with researchers developing new algorithms and optimizations to
     address emerging challenges in data processing.
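Before moving on, here is a small illustrative Python sketch of the KMP idea described in the algorithm list earlier (precomputing a longest prefix-suffix array so that comparisons are never repeated); the function names compute_lps and kmp_search are our own:

def compute_lps(pattern: str) -> list:
    """lps[i] = length of the longest proper prefix of pattern[:i+1] that is also a suffix of it."""
    lps = [0] * len(pattern)
    length = 0                          # length of the previous longest prefix-suffix
    for i in range(1, len(pattern)):
        while length > 0 and pattern[i] != pattern[length]:
            length = lps[length - 1]    # fall back to the next shorter prefix-suffix
        if pattern[i] == pattern[length]:
            length += 1
        lps[i] = length
    return lps

def kmp_search(text: str, pattern: str) -> list:
    """Return all starting indices of 'pattern' in 'text' in O(m + n) time."""
    lps = compute_lps(pattern)
    matches, j = [], 0                  # j = number of pattern characters currently matched
    for i, ch in enumerate(text):
        while j > 0 and ch != pattern[j]:
            j = lps[j - 1]              # reuse the previously matched prefix instead of restarting
        if ch == pattern[j]:
            j += 1
        if j == len(pattern):
            matches.append(i - j + 1)
            j = lps[j - 1]              # continue searching for further occurrences
    return matches

print(kmp_search("ababcabcabababd", "ababd"))   # -> [10]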
5. The Boyer-Moore Algorithm
Robert Boyer and J Strother Moore introduced it in 1977. The B-M string search algorithm is a particularly efficient algorithm, and it has served as a standard benchmark for string-search algorithms ever since.
The B-M algorithm takes a 'backward' approach: the pattern string (P) is aligned with the start of the text string (T), and the characters of the pattern are compared from right to left, beginning with the rightmost character.
If a text character is encountered that does not occur in the pattern at all, no match can be found at this position, so the pattern can be shifted entirely past the mismatching character.
For deciding the possible shifts, the B-M algorithm uses two preprocessing strategies simultaneously. Whenever a mismatch occurs, the algorithm calculates a shift using both approaches and selects the larger one, thus making use of the most effective strategy for each case.
The two strategies are called the heuristics of B-M, as they are used to reduce the search. They are:
1. Bad Character Heuristics
2. Good Suffix Heuristics
1. Bad Character Heuristics
This heuristic has two implications:
- Suppose there is a character in the text that does not occur in the pattern at all. When a mismatch happens at this character (called the bad character), the whole pattern can be shifted past it, and matching begins from the substring next to this bad character.
- On the other hand, the bad character may be present in the pattern; in this case, align the last occurrence of that character in the pattern with the bad character in the text.
Thus, in either case, the shift may be greater than one.
Example 1: Let text T = <nyoo nyoo> and pattern P = <noyo>.

Example 2: If a bad character does not exist in the pattern, the whole pattern is shifted past it.

Problem in Bad-Character Heuristics:
In some cases, the bad-character heuristic produces a negative shift.
For example:

This means that we need some extra information to produce a shift on encountering a bad character. This information is the last position of every character in the pattern and also the set of characters used in the pattern (often called the alphabet ∑ of a pattern).
COMPUTE-LAST-OCCURRENCE-FUNCTION (P, m, ∑)
1. for each character a ∈ ∑
2.     do λ [a] ← 0
3. for j ← 1 to m
4.     do λ [P [j]] ← j
5. return λ
2. Good Suffix Heuristics:
A good suffix is a suffix that has matched successfully. After a mismatch (which may give a negative shift under the bad-character heuristic), check whether the part of the pattern matched so far (the good suffix) occurs again within the pattern; if it does, the pattern can be shifted forward accordingly.
Example:

COMPUTE-GOOD-SUFFIX-FUNCTION (P, m)
1. Π ← COMPUTE-PREFIX-FUNCTION (P)
2. P' ← reverse (P)
3. Π' ← COMPUTE-PREFIX-FUNCTION (P')
4. for j ← 0 to m
5.     do ɣ [j] ← m - Π [m]
6. for l ← 1 to m
7.     do j ← m - Π' [l]
8.         if ɣ [j] > l - Π' [l]
9.             then ɣ [j] ← l - Π' [l]
10. return ɣ
 
BOYER-MOORE-MATCHER (T, P, ∑)
1. n ← length [T]
2. m ← length [P]
3. λ ← COMPUTE-LAST-OCCURRENCE-FUNCTION (P, m, ∑)
4. ɣ ← COMPUTE-GOOD-SUFFIX-FUNCTION (P, m)
5. s ← 0
6. while s ≤ n - m
7.     do j ← m
8.         while j > 0 and P [j] = T [s + j]
9.             do j ← j - 1
10.        if j = 0
11.            then print "Pattern occurs at shift" s
12.                s ← s + ɣ [0]
13.            else s ← s + max (ɣ [j], j - λ [T [s + j]])
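As a runnable illustration, here is a simplified Python sketch that implements only the bad-character (last-occurrence) heuristic; a full Boyer-Moore implementation would also use the good-suffix table computed above. The function names are our own:

def last_occurrence(pattern: str) -> dict:
    """λ[c] = index of the last occurrence of c in the pattern (0-based here, unlike the 1-based pseudocode)."""
    return {ch: i for i, ch in enumerate(pattern)}

def boyer_moore_bad_char(text: str, pattern: str) -> list:
    """Boyer-Moore search using only the bad-character heuristic (0-indexed shifts)."""
    n, m = len(text), len(pattern)
    lam = last_occurrence(pattern)
    shifts, s = [], 0
    while s <= n - m:
        j = m - 1
        while j >= 0 and pattern[j] == text[s + j]:   # compare right to left
            j -= 1
        if j < 0:
            shifts.append(s)                          # full match at shift s
            s += 1                                    # (a full implementation would use the good-suffix shift here)
        else:
            bad = text[s + j]                         # mismatching text character
            s += max(1, j - lam.get(bad, -1))         # align its last occurrence in the pattern, or skip past it
    return shifts

print(boyer_moore_bad_char("nyoo nyoo", "noyo"))      # -> [] (no occurrence, as in Example 1)
print(boyer_moore_bad_char("ABAAABCD", "ABC"))        # -> [4]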
NP-Completeness
Introduction
NP-completeness is a fundamental concept in theoretical computer science that addresses the inherent difficulty of certain computational problems. The abbreviation "NP" stands for nondeterministic polynomial time, which is a class of decision problems that can be solved in polynomial time by a nondeterministic Turing machine. NP-completeness provides the ability to identify and classify problems that are believed to be inherently hard, in the sense that efficient algorithms for solving them may not exist. Stephen Cook introduced this concept in 1971 when he showed the existence of problems with this degree of complexity.
The cornerstone of NP-completeness is the notion of a "hard" problem in the NP category. A decision problem is said to be NP-complete if it satisfies two critical conditions:
1. Being in NP: solutions to the problem can be quickly checked using a deterministic polynomial-time algorithm. If someone proposes a solution, its correctness can be verified relatively efficiently.
2. Difficulty: the problem is at least as difficult as the hardest problems in NP. If there is an efficient algorithm for any NP-complete problem, then there is an efficient algorithm for all problems in NP.
This framework was established by the Cook-Levin theorem, which proved the NP-completeness of the Boolean satisfiability problem (SAT). The importance of NP-completeness lies in its effect on algorithmic efficiency: if an efficient algorithm could be found for any NP-complete problem, this would imply the existence of efficient algorithms for all NP problems.
Basically, the question is whether every problem whose solution can be verified quickly can also be solved quickly. The identification of new NP-complete problems has been a critical area of research, and many well-known problems in various fields, including graph theory, optimization, and planning, have been shown to be NP-complete. NP-completeness has far-reaching consequences for cryptography, optimization, and understanding the limits of computing. The P vs. NP problem remains one of the most important unsolved problems in computer science.

History of NP-Completeness
The history of NP-completeness is closely related to the development of computational complexity theory and the study of the inherent difficulty of computational problems. Here is a brief overview of the major milestones:
- Boolean Satisfiability Problem (SAT): The concept of NP-completeness was introduced by Stephen Cook in his landmark 1971 article, "The Complexity of Theorem-Proving Procedures". Cook showed the existence of an NP-complete problem by establishing the NP-completeness of the Boolean satisfiability problem (SAT). Cook's result was groundbreaking and laid the foundation for the theory of NP-completeness.
- Cook-Levin theorem: Stephen Cook's paper presented the Cook-Levin theorem, which shows that the Boolean satisfiability problem is NP-complete. This theorem is fundamental to NP-completeness and is the starting point for proving the NP-completeness of other problems.
- Karp's 21 NP-complete problems: In 1972, Richard Karp published the seminal paper "Reducibility Among Combinatorial Problems", in which he identified 21 distinct combinatorial problems that are NP-complete. This work showed that many problems in different fields, including graph theory, set theory, and optimization, share the same computational difficulty.
- NP-completeness proofs for various problems: After Karp's original set of 21 problems, researchers continued to prove the NP-completeness of additional problems in various fields, including scheduling problems, graph coloring, and the traveling salesman problem.
- Development of complexity classes: As researchers explored the landscape of computational complexity, additional complexity classes such as P (polynomial time), NP (nondeterministic polynomial time), and co-NP were defined. The P vs. NP problem, which asks whether P is equal to NP, became a central question in the field and remains unresolved.
- Cook's theorem and Levin's work: Cook's early work was influential, and Leonid Levin independently developed similar ideas around the same time. Levin's work contributed to the understanding of NP-completeness and complexity theory.
- Contributions from other researchers: Many other researchers have made important contributions to the field of NP-completeness, including Michael Sipser, Neil Immerman, and others who have extended the theory to concepts such as the polynomial-time hierarchy and space complexity.
- Practical implications: Although NP-completeness implies theoretical hardness, it also has practical implications. Many real-world problems believed to be NP-complete exhibit similar computational challenges. This led to the development of approximation algorithms and heuristics to solve these problems in practice.
The history of NP-completeness is characterized by the study of the nature of computation, the limits of efficient algorithms, and the classification of problems based on their inherent difficulty. The P vs. NP problem remains one of the most important open questions in computer science.
The NP-completeness Problem
The notion of NP-completeness is a central theme in computational complexity theory. It refers to a class of problems that are both in the NP (nondeterministic polynomial time) complexity class and, informally speaking, as hard as the hardest problems in NP. Here are some key points about NP-completeness:
- Definition of NP-completeness: A decision problem is in NP if a proposed solution can be verified quickly (in polynomial time). A problem is NP-complete if it is in NP and every problem in NP can be reduced to it in polynomial time.
- Cook's theorem: Cook's theorem, proved by Stephen Cook in 1971, was the first to establish the notion of NP-completeness. It showed that the Boolean satisfiability problem (SAT) is NP-complete.
- Transitivity of polynomial-time reduction: If an NP-complete problem B can be reduced to a problem A in polynomial time, then A is at least as hard as B; if A is also in NP, then A is NP-complete as well. This is the basis for proving the NP-completeness of many problems.
- Consequences of NP-completeness: If any NP-complete problem can be solved in polynomial time, then every problem in NP can be solved in polynomial time. This is the P vs. NP question, one of the most famous open problems in computer science.
- Common NP-complete problems: Many well-known problems are NP-complete, including the traveling salesman problem, the knapsack problem, and the clique problem.
- Practical implications: NP-completeness of a problem does not necessarily mean that every instance is difficult to solve in practice. It means that, in the worst case, the time required by known deterministic algorithms grows exponentially as the size of the input increases.
- Approximation algorithms: For many NP-complete problems, finding an exact solution in polynomial time is unlikely. Instead, approximation algorithms are often used to find near-optimal solutions in a reasonable amount of time.
- Applications: NP-complete problems arise in many fields, including cryptography, optimization, planning, and logistics.
- Heuristic approaches: Since solving NP-complete problems exactly is difficult, heuristic approaches and approximation algorithms are often used in practice to find solutions that are good enough for practical purposes.
Understanding NP-completeness has important implications for the limits of efficient computing and for the design of algorithms and problem-solving strategies.
Solving NP-Complete Problems
Solving NP-complete problems is a difficult task because, by definition, these problems are probably hard to solve efficiently (in polynomial time). However, there are several strategies and approaches used to attack NP-complete problems:
1. Exact algorithms: Although NP-complete problems are generally difficult to solve exactly, small instances or special cases of these problems can sometimes be solved efficiently. Exact algorithms may be feasible for small problem sizes or for special cases.
2. Approximation algorithms: Instead of striving for an exact solution, approximation algorithms provide solutions that are guaranteed to be close to the optimal solution. These algorithms often have polynomial time complexity and are suitable for practical applications.
3. Heuristic algorithms: Heuristic algorithms are designed to quickly find a good, although not necessarily optimal, solution. They are often used for large NP-complete problems where finding an exact solution is impractical. Examples include genetic algorithms, simulated annealing, and local search algorithms.
4. Problem-specific techniques: Some NP-complete problems have special properties that can be exploited to develop specialized algorithms. Understanding the characteristics of a given problem can lead to more efficient algorithms.
5. Parameterized complexity: In parameterized complexity theory, the goal is to identify parameters that make a problem tractable. Although a problem may be NP-complete in general, it can often be solved efficiently when certain parameters are small.
6. Parallel and distributed computing: Some NP-complete problems can be attacked more effectively using parallel or distributed computing. Breaking down a large problem into smaller subproblems that can be solved simultaneously can speed up the overall solution.
7. Problem-specific reductions: In some cases, reductions can be used to transform an instance of one NP-complete problem into an instance of a problem with known properties. This allows existing algorithms or heuristics for other problems to be reused. Specialized hardware: Using specialized hardware such as GPUs or quantum computers can speed up the solution of certain instances of NP-complete problems.
8. Preprocessing and pruning: Preprocessing techniques can be used to simplify the problem before applying more complex algorithms. Pruning techniques can remove branches from the search space that are unlikely to lead to optimal solutions.
9. Metaheuristics: Metaheuristic approaches such as genetic algorithms, particle swarm optimization, and ant colony optimization are general problem-solving strategies that can be adapted to a variety of NP-complete problems.
It is important to note that the choice of approach depends on the specific NP-complete problem, the size of the problem instance, and the available resources. Many real-world applications use a combination of these techniques to find practical solutions. However, for certain instances and problem sizes, finding the optimal solution can remain computationally challenging.
Approximation Algorithms and Randomized Algorithms
Randomized algorithms are algorithms that use random numbers to decide what to do next at some points in their logic; that is, they use randomization in their operation. Due to this randomness, the expected time or space complexity can often be reduced compared with the corresponding deterministic algorithm.
In other words, randomized algorithms use random numbers to find solutions to problems and to improve the performance of the algorithms. Such algorithms are categorized into the following two types:
1. Monte Carlo randomized algorithms: These randomized algorithms have a small probability that their output is incorrect.
2. Las Vegas randomized algorithms: These randomized algorithms always give a correct result or report failure.

Approximation algorithm
An approximation algorithm finds approximate solutions for optimization problems. It does not always guarantee to produce the best solution, but it tries to come close to the optimal value in polynomial time, and it provides a provable guarantee on how close to the optimum its solution is.
Randomized
approximation algorithm
Approximation algorithms are efficient
algorithms for solving optimization problems. However, when the concept of
randomness is used in approximation algorithms, it enhances the algorithm’s
performance.
Randomized approximation algorithms can efficiently solve problems for which no comparable deterministic algorithm is known. They are also used in simulated annealing methods for solving optimization problems.
Randomized algorithms make random selections that vary from run to run on the input given to the algorithm. If a significant fraction of the runs of the algorithm work well for the given input, a procedure that restarts the algorithm after a particular amount of time during its execution can be used, which helps speed up the algorithm.
Randomized approximation algorithm example (Max-Cut algorithm)
It is possible to prove that the greedy approximation algorithm yields a 2-approximation for the max-cut problem. However, if randomization is used, a much shorter algorithm that gives the same guarantee and executes in expected polynomial time can be obtained.
Randomized max-cut algorithm: Consider an undirected graph G = (V, E) with a positive weight for every edge. The max-cut problem asks for a partition of V into two subsets A and B that maximizes the combined weight of the edges with one endpoint in A and the other in B. The randomized algorithm simply places each vertex in A or B uniformly at random; if fewer than m/2 edges cross the resulting cut, the process is repeated.
Theorem: The random cut is a ½-approximation algorithm for the max-cut problem that executes in expected polynomial time.
Proof:
Consider X as a random variable representing the number of edges crossing the random cut. For i = 1, 2, ..., m, let Xi be the indicator variable that is 1 if the i-th edge crosses the random cut and 0 otherwise. So X = ∑_{i=1}^{m} Xi and, by linearity of expectation, E[X] = ∑_{i=1}^{m} E[Xi]. Now, the probability that an edge {u, v} crosses a randomly selected cut is ½, since u and v are placed independently in one of the two partitions and the chance that they end up in different partitions is ½. Therefore, E[Xi] = ½ for every i. Hence, E[X] = m/2.
This shows that a random cut gives m/2 crossing edges in expectation. Now, the goal is to turn this into a randomized algorithm that always outputs a good cut and whose running time is a random variable with polynomial expectation. This can be done by bounding the probability that X ≥ m/2 when a random cut is selected. In the worst case, when X ≥ m/2 all the probability mass sits at m, and when X < m/2 all the probability mass sits at m/2 - 1; this makes the expectation of X as large as possible, so
m/2 = E[X] ≤ (1 - Pr[X ≥ m/2]) * (m/2 - 1) + Pr[X ≥ m/2] * m.
Solving for Pr[X ≥ m/2] shows that it is at least 2/(m+2). This means the expected number of iterations of the algorithm is at most (m+2)/2. Hence the algorithm executes in expected polynomial time and generates a cut of size at least m/2. In other words, the expected weight of the generated cut is at least half the weight of the maximum cut (which contains at most m edges).
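A minimal Python sketch of this randomized ½-approximation is shown below (the function name random_max_cut and the edge-list input format are our own choices):

import random

def random_max_cut(num_vertices: int, edges: list) -> tuple:
    """Repeatedly pick a uniformly random partition (A, B) until at least m/2 edges cross it."""
    m = len(edges)
    while True:
        # Place each vertex in A (True) or B (False) independently with probability 1/2.
        side = [random.random() < 0.5 for _ in range(num_vertices)]
        crossing = sum(1 for u, v in edges if side[u] != side[v])
        if crossing >= m / 2:                 # expected number of repetitions is at most (m+2)/2
            A = [v for v in range(num_vertices) if side[v]]
            B = [v for v in range(num_vertices) if not side[v]]
            return A, B, crossing

# Example: a 4-cycle 0-1-2-3-0 (the maximum cut has 4 edges, so at least 2 crossing edges are guaranteed here).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(random_max_cut(4, edges))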
Applications of the randomized approximation algorithm
One of the main applications of randomized approximation algorithms is randomized backtracking. Backtracking helps in solving combinatorial problems and typically takes exponential time to execute. However, recent research has shown that when randomness is introduced into backtracking, such problems can often be solved much more quickly.
Backtracking problems typically contain small substructures. Each substructure has the property that the solution to the entire problem can be identified if any one substructure is solved correctly. For this reason, these problems become tractable when solved using randomization techniques. The overall execution time that backtracking takes to generate a solution is reduced significantly by repeatedly restarting the backtracking search after a polynomial amount of time.