UNIT-5 SELECTED TOPICS
Algebraic computation
Algebraic computation in the context of Design and
Analysis of Algorithms (DAA) typically involves applying algebraic techniques
to analyze the performance of algorithms, particularly in terms of time complexity,
space complexity, and other resource requirements.
Here are some key aspects of algebraic computation
in DAA:
1. Big-O Notation:
Big-O notation is used to describe the asymptotic upper bound of an algorithm's complexity. It expresses how the runtime or space requirements grow as the size of the input increases.
Algebraic computation involves finding a function f(n) that bounds the behavior of the algorithm as a function of the input size n.
2. Recurrence Relations:
Recurrence relations often arise when analyzing recursive algorithms.
For example, the time complexity of a divide-and-conquer algorithm like Merge Sort is expressed by the recurrence relation: T(n) = 2T(n/2) + O(n)
Solving such recurrences involves algebraic techniques such as substitution, iteration, or using the master theorem.
3. Solving Recurrences:
Substitution Method: Guess the form of the solution and prove it by induction.
Iteration Method: Unfold the recurrence step by step and sum the results.
Master Theorem: Provides a straightforward way to analyze divide-and-conquer recurrences of the form: T(n) = aT(n/b) + O(n^d)
4. Summations:
Many algorithms involve summing over a range of values, such as summing the work done at each level of a recursive algorithm. Algebraic methods can help simplify such sums.
For example, the sum of the first n integers is given by: ∑_{i=1}^{n} i = n(n+1)/2
5. Polynomials and Exponentials:
In complexity analysis, it is common to deal with polynomials (e.g., O(n^2)) and exponential functions (e.g., O(2^n)).
Algebraic computation helps simplify and compare such functions in terms of growth rates.
For example, O(n^2) grows slower than O(2^n) as n increases.
6. Matrix Multiplication:
Some algorithms, such as those related to dynamic
programming or solving systems of linear equations, involve matrix
multiplication.
Algebraic computation in this context involves
understanding the cost of matrix operations, which often requires handling
polynomials, logarithms, and powers of matrices.
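To make the cost of matrix operations concrete, here is a minimal Python sketch of the naive O(n^3) matrix multiplication (the function name matrix_multiply is our own choice, not part of the notes):

def matrix_multiply(A, B):
    """Naive matrix multiplication: O(n^3) scalar multiplications for n x n matrices."""
    n, m, p = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "inner dimensions must agree"
    C = [[0] * p for _ in range(n)]
    for i in range(n):
        for k in range(m):
            for j in range(p):
                C[i][j] += A[i][k] * B[k][j]
    return C

# 2x2 example; the three nested loops make the cubic cost explicit.
print(matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # -> [[19, 22], [43, 50]]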
7. Generating Functions:
Generating functions are often used in combinatorial
algorithms to solve recurrence relations, particularly in counting problems.
For example, generating functions can help in
solving recurrences that arise in dynamic programming.
Example: Solving a Recurrence Using the Master Theorem
Consider the recurrence:
T(n) = 3T(n/2) + O(n)
This recurrence follows the form T(n) = aT(n/b) + O(n^d) where a = 3, b = 2, and d = 1.
To apply the master theorem:
Compute log_b(a) = log_2(3) ≈ 1.585.
Compare d with log_b(a):
If d < log_b(a), the complexity is O(n^(log_b a)).
If d = log_b(a), the complexity is O(n^d log n).
If d > log_b(a), the complexity is O(n^d).
Since 1 < 1.585, the complexity is O(n^(log_2 3)) ≈ O(n^1.585).
This type of algebraic computation allows you to
derive the time complexity of recursive algorithms efficiently.
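The case analysis above is mechanical enough to capture in a few lines of code. The following Python sketch (the helper name master_theorem is our own, not from the notes) classifies a recurrence of the form T(n) = aT(n/b) + O(n^d):

import math

def master_theorem(a: int, b: int, d: float) -> str:
    """Classify T(n) = a*T(n/b) + O(n^d) using the master theorem."""
    log_b_a = math.log(a, b)              # critical exponent log_b(a)
    if d < log_b_a:
        return f"O(n^{log_b_a:.3f})"      # leaves dominate
    elif d == log_b_a:                    # exact float comparison; fine for illustration
        return f"O(n^{d} log n)"          # work is balanced across levels
    else:
        return f"O(n^{d})"                # root dominates

# Example from the text: T(n) = 3T(n/2) + O(n)
print(master_theorem(3, 2, 1))            # -> O(n^1.585)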
Fast Fourier Transformation for polynomial multiplication
Given two polynomials A(x) and B(x), find the product C(x) = A(x)*B(x). There is a naive O(n^2) approach to solve this problem, which uses the coefficient form of the polynomials to calculate the product.
A coefficient representation of a polynomial A(x) = ∑_{j=0}^{n-1} a_j x^j is the vector a = (a0, a1, …, a(n-1)).
Example:
A(x) = 6x^3 + 7x^2 - 10x + 9
B(x) = -2x^3 + 4x - 5
Coefficient representation of A(x) = (9, -10, 7, 6)
Coefficient representation of B(x) = (-5, 4, 0, -2)
Input:
A[] = {9, -10, 7, 6}
B[] = {-5, 4, 0, -2}
Output:
-12x^6 - 14x^5 + 44x^4 - 20x^3 - 75x^2 + 86x - 45
We can do better if we represent the polynomial in another form.
The idea is to represent the polynomial in point-value form and then compute the product. A point-value representation of a polynomial A(x) of degree-bound n is a set of n point-value pairs {(x0, y0), (x1, y1), …, (x(n-1), y(n-1))} such that all the xi are distinct and yi = A(xi) for i = 0, 1, …, n-1.
Example:
A(x) = x^3 - 2x + 1
xi    -- 0, 1, 2, 3
A(xi) -- 1, 0, 5, 22
The point-value representation of the above polynomial is {(0, 1), (1, 0), (2, 5), (3, 22)}. Using Horner's method, evaluating A(x) at one point takes O(n) time, so evaluating it at n different points takes O(n^2) time. Once both polynomials are in point-value form, the product C(x) = A(x)*B(x) is obtained by multiplying the values pointwise, which takes O(n) time. An important point here is that C(x) has degree-bound 2n, so n points would give only n values of C(x); we therefore need 2n distinct values of x to obtain the 2n point-value pairs of C(x). Once the product is calculated, the answer has to be converted back into coefficient vector form. To get back to coefficient vector form we use the inverse of evaluation, which is called interpolation. Interpolation using Lagrange's formula converts the point-value form back to the coefficient vector form of the polynomial. Lagrange's formula is:
A(x) = ∑_{k=0}^{n-1} y_k * ( ∏_{j≠k} (x - x_j) ) / ( ∏_{j≠k} (x_k - x_j) )
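As an illustration of the interpolation step, the following Python sketch (the function name lagrange_interpolate is our own, and it uses exact fractions purely for clarity) recovers the coefficient form from point-value pairs using Lagrange's formula:

from fractions import Fraction

def lagrange_interpolate(points):
    """Recover coefficient form [c0, c1, ...] from point-value pairs via Lagrange's formula."""
    n = len(points)
    coeffs = [Fraction(0)] * n
    for k, (xk, yk) in enumerate(points):
        # Build the numerator polynomial  prod_{j != k} (x - x_j)  as a coefficient list.
        num = [Fraction(1)]
        denom = Fraction(1)
        for j, (xj, _) in enumerate(points):
            if j == k:
                continue
            # Multiply num by (x - xj).
            new = [Fraction(0)] * (len(num) + 1)
            for i, c in enumerate(num):
                new[i] += c * (-xj)
                new[i + 1] += c
            num = new
            denom *= (xk - xj)
        for i, c in enumerate(num):
            coeffs[i] += yk * c / denom
    return coeffs

# Point-value pairs of A(x) = x^3 - 2x + 1 from the example above.
pts = [(0, 1), (1, 0), (2, 5), (3, 22)]
print(lagrange_interpolate(pts))
# -> [Fraction(1, 1), Fraction(-2, 1), Fraction(0, 1), Fraction(1, 1)], i.e. 1 - 2x + x^3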
So far, we have discussed representing polynomials in coefficient form and in point-value form, multiplying them pointwise, and interpolating back to coefficient form.
This idea still solves the problem in O(n^2) time complexity. We can use any points we want as evaluation points, but by choosing the evaluation points carefully, we can convert between representations in only O(n log n) time. If we choose "complex roots of unity" as the evaluation points, we can produce a point-value representation by taking the discrete Fourier transform (DFT) of a coefficient vector. We can perform the inverse operation, interpolation, by taking the "inverse DFT" of point-value pairs, yielding a coefficient vector. The Fast Fourier Transform (FFT) can perform the DFT and inverse DFT in time O(n log n).
DFT
The DFT evaluates the polynomial at the n complex nth roots of unity ω_n^0, ω_n^1, ω_n^2, …, ω_n^(n-1). So, for y_k = A(ω_n^k), k = 0, 1, 2, …, n-1, the vector y = (y0, y1, y2, …, y(n-1)) is the Discrete Fourier Transform (DFT) of the given polynomial.
The product of two polynomials of degree-bound n is a polynomial of degree-bound 2n. Before evaluating the input polynomials A and B, therefore, we first double their degree-bounds to 2n by adding n high-order coefficients of 0. Because the vectors have 2n elements, we use the complex 2nth roots of unity, denoted ω_2n. We assume that n is a power of 2; we can always meet this requirement by adding high-order zero coefficients.
FFT
Here is the divide-and-conquer strategy to solve this problem.
Define two new polynomials of degree-bound n/2, using the even-index and odd-index coefficients of A(x) separately:
A0(x) = a0 + a2x + a4x^2 + ... + a(n-2)x^(n/2-1)
A1(x) = a1 + a3x + a5x^2 + ... + a(n-1)x^(n/2-1)
A(x) = A0(x^2) + x*A1(x^2)
The problem of evaluating A(x) at ω_n^0, ω_n^1, ω_n^2, …, ω_n^(n-1) reduces to evaluating the degree-bound n/2 polynomials A0(x) and A1(x) at the points (ω_n^0)^2, (ω_n^1)^2, …, (ω_n^(n-1))^2, and then combining the results by A(x) = A0(x^2) + x*A1(x^2).
The list (ω_n^0)^2, (ω_n^1)^2, …, (ω_n^(n-1))^2 does not contain n distinct values, but only the n/2 complex (n/2)th roots of unity. Polynomials A0 and A1 are therefore recursively evaluated at the n/2 complex (n/2)th roots of unity. The subproblems have exactly the same form as the original problem but are half the size, so the recurrence formed is T(n) = 2T(n/2) + O(n), i.e., complexity O(n log n).
Algorithm
1. Add n higher-order zero coefficients to A(x) and
B(x)
2. Evaluate A(x) and B(x) using FFT for 2n points
3. Pointwise multiplication of point-value forms
4. Interpolate C(x) using FFT to compute inverse DFT
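The four steps above can be realized with a short recursive FFT. The following Python sketch is illustrative only (the names fft and multiply_polynomials are our own, and it uses Python's built-in complex numbers rather than an optimized library); it reproduces the example product computed earlier:

import cmath

def fft(a, invert=False):
    """Recursive FFT: evaluates the coefficient vector 'a' at the complex nth roots of unity (len(a) must be a power of 2)."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)            # A0: even-index coefficients
    odd = fft(a[1::2], invert)             # A1: odd-index coefficients
    sign = -1 if invert else 1
    result = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)   # ω_n^k (conjugated for the inverse DFT)
        result[k] = even[k] + w * odd[k]              # A(ω_n^k)        = A0(ω_n^2k) + ω_n^k * A1(ω_n^2k)
        result[k + n // 2] = even[k] - w * odd[k]     # A(ω_n^(k+n/2)), since ω_n^(k+n/2) = -ω_n^k
    return result

def multiply_polynomials(a, b):
    """Multiply two coefficient vectors using FFT in O(n log n) time."""
    n = 1
    while n < len(a) + len(b):             # degree-bound of the product is 2n; pad to a power of 2
        n *= 2
    fa = fft(a + [0] * (n - len(a)))
    fb = fft(b + [0] * (n - len(b)))
    fc = [x * y for x, y in zip(fa, fb)]   # pointwise multiplication in point-value form
    c = fft(fc, invert=True)               # inverse DFT (interpolation), then divide by n
    return [round((x / n).real) for x in c]

# Example from the text: A(x) = 6x^3 + 7x^2 - 10x + 9, B(x) = -2x^3 + 4x - 5
print(multiply_polynomials([9, -10, 7, 6], [-5, 4, 0, -2]))
# -> [-45, 86, -75, -20, 44, -14, -12, 0]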
String Matching
Introduction
A string matching algorithm is also called a "string searching algorithm." This is a vital class of string algorithms: they find the places where one or several strings (patterns) occur within a larger string (the text).
Given a text array T [1.....n] of n characters and a pattern array P [1......m] of m characters, the problem is to find an integer s, called a valid shift, where 0 ≤ s ≤ n-m and T [s+1......s+m] = P [1......m]. In other words, we want to find whether P occurs in T, i.e., whether P is a substring of T. The items of P and T are characters drawn from some finite alphabet, such as {0, 1} or {A, B, ..., Z, a, b, ..., z}.
Given a string T [1......n], a substring is T [i......j] for some 1 ≤ i ≤ j ≤ n: the string formed by the characters of T from index i to index j, inclusive. This means that every string is a substring of itself (take i = 1 and j = n).
A proper substring of the string T [1......n] is a substring T [i......j] with either i > 1 or j < n; that is, a substring that is not the whole string.
Using these definitions, for any string T [1......n] the substrings are
T [i.....j] = T [i] T [i+1] T [i+2]......T [j] for some 1 ≤ i ≤ j ≤ n,
and the proper substrings are those with i > 1 or j < n.
Note: If i > j, then T [i.....j] is the empty (null) string, which has length zero.
Algorithms used for String Matching:
There are different methods used for finding a pattern in a string:
1. The Naive String Matching Algorithm
2. The Rabin-Karp Algorithm
3. Finite Automata
4. The Knuth-Morris-Pratt Algorithm
5. The Boyer-Moore Algorithm
1. The Naive String Matching Algorithm
The naive approach tests all possible placements of the pattern P [1.......m] relative to the text T [1......n]. We try shifts s = 0, 1, ..., n-m successively and, for each shift s, compare T [s+1.......s+m] to P [1......m].
The naive algorithm finds all valid shifts using a loop that checks the condition P [1.......m] = T [s+1.......s+m] for each of the n - m + 1 possible values of s.
NAIVE-STRING-MATCHER (T, P)
1. n ← length [T]
2. m ← length [P]
3. for s ← 0 to n - m
4.     do if P [1.....m] = T [s + 1....s + m]
5.         then print "Pattern occurs with shift" s
Analysis: The for loop in lines 3 to 5 executes n - m + 1 times (we need at least m characters at the end), and in each iteration we make up to m character comparisons. So the total complexity is O((n - m + 1) m).
Example:
Suppose T = 1011101110
        P = 111
Find all the valid shifts.
Solution: The pattern P = 111 occurs in T at shifts s = 2 and s = 6, since T [3...5] = 111 and T [7...9] = 111.
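For reference, here is a minimal Python sketch of the naive matcher; it is 0-indexed, unlike the 1-indexed pseudocode above, and the function name naive_string_matcher is our own:

def naive_string_matcher(text: str, pattern: str) -> list:
    """Return all valid shifts s with text[s:s+m] == pattern (0-indexed)."""
    n, m = len(text), len(pattern)
    shifts = []
    for s in range(n - m + 1):          # try every possible shift
        if text[s:s + m] == pattern:    # compare up to m characters
            shifts.append(s)
    return shifts

# Example from the text: T = 1011101110, P = 111  ->  valid shifts 2 and 6
print(naive_string_matcher("1011101110", "111"))   # -> [2, 6]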
2. The Rabin-Karp Algorithm
The Rabin-Karp Algorithm is a string-searching algorithm named after its authors, Michael O. Rabin and Richard M. Karp.
Understanding the Rabin-Karp Algorithm
The Rabin-Karp Algorithm is an algorithm utilized for searching/matching
patterns in the text with the help of a hash function. Unlike the Naïve
String-Matching algorithm, it doesn't traverse through every character in the
initial phase. It filters the characters that do not match and then performs
the comparison.
A hash function is a utility to map a larger input value to a smaller
output value. This output value is known as the Hash value.
The Rabin-Karp Algorithm is utilized to find all the occurrences of a provided pattern 'P' in a given string 'S' in O(Ns + Np) expected time, where 'Ns' and 'Np' are the lengths of 'S' and 'P', respectively.
Let us consider the following example to understand it in a better way.
Suppose a string S = "pabctxjabcjfsabcd" and pattern P =
"abc". We are required to find all the occurrences of 'P' in 'S'.
We can see that "abc" is occurring in
"pabctxjabcjfsabcd" at three positions. Thus, we will return that the
pattern 'P' is occurring in string 'S' at indices 1, 7, and 13.
Understanding the Working of the Rabin-Karp
Algorithm
A sequence of the characters is taken and checked for the possibility of
the presence of the given string. If the possibility is found, then character
matching is performed.
Let us consider the following steps to understand the algorithm:
Step 1: Suppose that the given text
string is:

Figure 1: The Given Text String
and the pattern to be searched in the above text string is:

Figure 2: The Pattern
Step 2: Let us start by assigning a
numerical value (v)/weight for the characters we will use in the problem. Here,
we have selected the first ten alphabets only (i.e., A to J).

Figure 3: Text Weights
Step 3: Let x be the length of the
pattern, and y be the length of the text string. Here, we have y = 10 and x =
3. Moreover, let n be the number of characters present in the input set. Since
we have taken input set as {A, B, C, D, E, F, G, H, I, J}. Therefore, n will be
equal to 10. We can assume any preferable value for n.
Step 4: Let us now start calculating the hash value of the pattern. For a pattern of length x, each character's weight is multiplied by a decreasing power of n (here the pattern is CDD, with weights 3, 4, 4):

Figure 4: Hash Value of the Pattern
Hash value of pattern (P) = (v1 * n^(x-1) + v2 * n^(x-2) + ... + vx * n^0) mod 13
= ((3 * 10^2) + (4 * 10^1) + (4 * 10^0)) mod 13
= 344 mod 13
= 6
Note: In the
above calculation, one can select a prime number in such a manner that we can
perform all the calculations with single-precision arithmetic. Here, we have
selected 13 as the prime number. We will discuss the purpose of calculating the
modulus later.
Step 5: We will now calculate the hash value for each text window of size x (the pattern length).
For the first window ABC,
Hash value of text window (s) = ((1 * 10^2) + (2 * 10^1) + (3 * 10^0)) mod 13
= 123 mod 13
= 6
Step 6: We will now compare the hash value of the pattern with the hash value of the text window. If they match, we will perform the character-matching operation.
In the above example, the hash value of the first window (i.e., s) matches the hash value of the pattern (i.e., P). Therefore, we will start matching the characters between ABC and CDD. Since the pattern does not match the first window, we will move to the next window.
Step 7: We will calculate the hash value of the next window (BCC) by subtracting the contribution of the first character and adding the contribution of the next character, as shown below:
s = ((123 - 1 * 10^2) * 10 + 3 * 10^0) mod 13
= 233 mod 13
= 12
We can use the former hash value directly in the following way in order to optimize this process:
s = (n * (t - v[character to be eliminated] * h) + v[character to be included]) mod 13
= (10 * (6 - 1 * 9) + 3) mod 13
= 12
where h = n^(x-1) = 10^2 = 100, taken mod 13, i.e., 100 mod 13 = 9.
Step 8: For BCC, the hash value, s,
will become 12, which is not equal to the hash value of the pattern, P, i.e.,
6. Therefore, we will move to the next window. After performing some searches,
we will manage to find the match for the window CDD in the text string.
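The walkthrough above can be summarized in a short Python sketch. The weights A=1, ..., J=10 and the prime modulus 13 follow the walkthrough; the function name rabin_karp and the sample text "ABCCDDAEFG" are our own (the original figure showing the full text string is not reproduced here), so treat this as an illustrative sketch rather than the notes' exact example:

def weight(ch: str) -> int:
    """Numerical value of a character: A=1, B=2, ..., J=10 (as in the walkthrough)."""
    return ord(ch) - ord('A') + 1

def rabin_karp(text: str, pattern: str, n_base: int = 10, q: int = 13) -> list:
    """Rabin-Karp matcher over the alphabet A..J, using a rolling hash mod the prime q."""
    x, y = len(pattern), len(text)
    if x > y:
        return []
    h = pow(n_base, x - 1, q)                  # weight of the leading character, mod q
    p_hash = 0
    t_hash = 0
    for i in range(x):                         # hash of the pattern and of the first window
        p_hash = (p_hash * n_base + weight(pattern[i])) % q
        t_hash = (t_hash * n_base + weight(text[i])) % q
    matches = []
    for s in range(y - x + 1):
        if p_hash == t_hash and text[s:s + x] == pattern:   # verify on a hash hit (avoids false positives)
            matches.append(s)
        if s < y - x:                          # roll the hash: drop text[s], add text[s+x]
            t_hash = (n_base * (t_hash - weight(text[s]) * h) + weight(text[s + x])) % q
    return matches

# Hypothetical text consistent with the walkthrough (starts with ABCC..., contains CDD):
print(rabin_karp("ABCCDDAEFG", "CDD"))         # -> [3]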
3. String Matching with Finite Automata
String matching algorithms are techniques used in computer science and data analysis to find the occurrence or position of one string (the "pattern") within another string (the "text").

String Matching Algorithms
- Brute-Force Algorithm: This is the simplest method, where you compare the pattern with every substring of the text.
- Knuth-Morris-Pratt (KMP) Algorithm: KMP is more efficient than brute force. It precomputes a longest prefix-suffix (LPS) array for the pattern to avoid unnecessary comparisons, reducing the overall time complexity to O(m + n).
- Boyer-Moore Algorithm: Boyer-Moore uses a "bad character" and "good suffix" heuristic to skip sections of the text, making it very efficient in practice, especially for long texts and patterns.
- Rabin-Karp Algorithm: Rabin-Karp uses hashing to quickly compare the pattern with potential substrings of the text. It has a time complexity of O(m * n) in the worst case but runs in O(m + n) expected time and can be very efficient for typical inputs.
- Aho-Corasick Algorithm: This algorithm is used for searching multiple patterns simultaneously in a text. It constructs a finite automaton and efficiently identifies all occurrences of the patterns.
- Regular Expressions: Many programming languages provide regular expression libraries for string matching. Regular expressions allow for complex pattern matching based on a specified pattern language.
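Since this section is titled string matching with finite automata, here is a minimal Python sketch of that approach (the names build_transition_table and fa_matcher are our own): it precomputes a DFA transition table for the pattern and then scans the text in a single pass, reaching the accepting state exactly when the pattern has been matched.

def build_transition_table(pattern: str, alphabet: str):
    """delta[state][ch] = length of the longest prefix of 'pattern' that is a suffix of
    (the already-matched prefix of length 'state') followed by ch."""
    m = len(pattern)
    delta = [dict() for _ in range(m + 1)]
    for state in range(m + 1):
        for ch in alphabet:
            k = min(m, state + 1)
            # Find the longest pattern prefix that is a suffix of pattern[:state] + ch.
            while k > 0 and not (pattern[:state] + ch).endswith(pattern[:k]):
                k -= 1
            delta[state][ch] = k
    return delta

def fa_matcher(text: str, pattern: str, alphabet: str):
    """Return all shifts at which 'pattern' occurs in 'text', scanning the text once."""
    m = len(pattern)
    delta = build_transition_table(pattern, alphabet)
    state, shifts = 0, []
    for i, ch in enumerate(text):
        state = delta[state].get(ch, 0)   # follow the automaton
        if state == m:                    # accepting state: full pattern matched
            shifts.append(i - m + 1)
    return shifts

# Reusing the earlier example: T = 1011101110, P = 111
print(fa_matcher("1011101110", "111", alphabet="01"))   # -> [2, 6]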
4. The Knuth-Morris-Pratt (KMP) Algorithm
Finding and locating specific patterns or substrings within a text or a larger string is one of the fundamental tasks in computer science. Such functionality is useful for finding a particular keyword or phrase within a larger text, since doing the same manually would be a tedious task.

What are String-Searching Algorithms?
String-searching algorithms are a way to efficiently look for a given substring or a specific pattern within a longer string. These patterns could be a single character, simple words, complex words, or even phrases.
Importance
in Computer Science
- Text
     Processing: String-searching algorithms are at the core of text processing
     tasks, such as searching for keywords in documents, parsing data, and
     extracting relevant information from textual data sources.
- Data
     Retrieval: In databases, string-searching is essential for querying
     information efficiently. It enables the retrieval of records that contain
     specific data patterns, improving the speed of data retrieval.
- Compiler
     Design: Compilers use string-searching algorithms to recognize and parse
     programming language constructs within source code, aiding in the
     compilation process.
- Bioinformatics:
     In bioinformatics, these algorithms help identify genetic sequences,
     patterns in DNA, and protein motifs, facilitating important research in
     genomics and proteomics.
- Network
     Security: String-searching is crucial in intrusion detection systems,
     where patterns of malicious code or network behaviors need to be identified
     rapidly.
- Natural
     Language Processing (NLP): NLP applications rely on string-searching to
     perform tasks like sentiment analysis, named entity recognition, and
     machine translation.
- Web Search
     Engines: Search engines use advanced string-searching techniques to match
     user queries to web pages, providing relevant search results.
Significance
- Efficiency:
     String-searching algorithms are designed for efficiency. They aim to
     minimize the time and resources required to locate patterns within large
     datasets. Efficient algorithms are critical for applications with strict
     performance requirements.
- Algorithmic
     Diversity: There is a wide variety of string-searching algorithms, each
     designed for specific scenarios and data characteristics.
- Problem-Solving:
     String-searching algorithms exemplify algorithmic problem-solving, an
     essential skill in computer science. They provide insight into techniques
     for optimizing data retrieval and pattern-matching tasks.
- Interdisciplinary
     Impact: String-searching algorithms have applications in numerous
     interdisciplinary fields, fostering collaboration between computer
     scientists and researchers in biology, linguistics, and more.
- Continual
     Advancements: The field of string-searching algorithms continues to
     evolve, with researchers developing new algorithms and optimizations to
     address emerging challenges in data processing.
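Before moving on, here is a small illustrative Python sketch of the KMP idea described in the algorithm list earlier (precomputing a longest prefix-suffix array so that comparisons are never repeated); the function names compute_lps and kmp_search are our own:

def compute_lps(pattern: str) -> list:
    """lps[i] = length of the longest proper prefix of pattern[:i+1] that is also a suffix of it."""
    lps = [0] * len(pattern)
    length = 0                          # length of the previous longest prefix-suffix
    for i in range(1, len(pattern)):
        while length > 0 and pattern[i] != pattern[length]:
            length = lps[length - 1]    # fall back to the next shorter prefix-suffix
        if pattern[i] == pattern[length]:
            length += 1
        lps[i] = length
    return lps

def kmp_search(text: str, pattern: str) -> list:
    """Return all starting indices of 'pattern' in 'text' in O(m + n) time."""
    lps = compute_lps(pattern)
    matches, j = [], 0                  # j = number of pattern characters currently matched
    for i, ch in enumerate(text):
        while j > 0 and ch != pattern[j]:
            j = lps[j - 1]              # reuse the previously matched prefix instead of restarting
        if ch == pattern[j]:
            j += 1
        if j == len(pattern):
            matches.append(i - j + 1)
            j = lps[j - 1]              # continue searching for further occurrences
    return matches

print(kmp_search("ababcabcabababd", "ababd"))   # -> [10]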
5. The Boyer-Moore Algorithm
Robert Boyer and J Strother Moore introduced it in 1977. The B-M string search algorithm is a particularly efficient algorithm, and it has served as a standard benchmark for string-search algorithms ever since.
The B-M algorithm takes a 'backward' approach: the pattern string (P) is aligned with the start of the text string (T), and the characters of the pattern are compared from right to left, beginning with the rightmost character.
If a text character is encountered that does not occur in the pattern at all, no match can be found at this position, so the pattern can be shifted entirely past the mismatching character.
For deciding the possible shifts, the B-M algorithm uses two preprocessing strategies simultaneously. Whenever a mismatch occurs, the algorithm calculates a shift using both approaches and selects the larger one, thus making use of the most effective strategy for each case.
The two strategies are called the heuristics of B-M, as they are used to reduce the search. They are:
1. Bad Character Heuristics
2. Good Suffix Heuristics
1. Bad Character Heuristics
This heuristic has two implications:
- Suppose there is a character in the text that does not occur in the pattern at all. When a mismatch happens at this character (called the bad character), the whole pattern can be shifted past it, and matching begins from the substring next to this bad character.
- On the other hand, the bad character may be present in the pattern; in this case, align the last occurrence of that character in the pattern with the bad character in the text.
Thus, in either case, the shift may be greater than one.
Example 1: Let text T = <nyoo nyoo> and pattern P = <noyo>.

Example 2: If a bad character does not exist in the pattern, the whole pattern is shifted past it.

Problem in Bad-Character Heuristics:
In some cases, the bad-character heuristic produces a negative shift.
For example:

This means that we need some extra information to produce a shift on encountering a bad character. This information is the last position of every character in the pattern and also the set of characters used in the pattern (often called the alphabet ∑ of a pattern).
COMPUTE-LAST-OCCURRENCE-FUNCTION (P, m, ∑)
1. for each character a ∈ ∑
2.     do λ [a] ← 0
3. for j ← 1 to m
4.     do λ [P [j]] ← j
5. return λ
2. Good Suffix Heuristics:
A good suffix is a suffix that has matched successfully. After a mismatch (which may give a negative shift under the bad-character heuristic), check whether the part of the pattern matched so far (the good suffix) occurs again within the pattern; if it does, the pattern can be shifted forward accordingly.
Example:

COMPUTE-GOOD-SUFFIX-FUNCTION (P, m)
1. Π ← COMPUTE-PREFIX-FUNCTION (P)
2. P' ← reverse (P)
3. Π' ← COMPUTE-PREFIX-FUNCTION (P')
4. for j ← 0 to m
5.     do ɣ [j] ← m - Π [m]
6. for l ← 1 to m
7.     do j ← m - Π' [l]
8.         if ɣ [j] > l - Π' [l]
9.             then ɣ [j] ← l - Π' [l]
10. return ɣ
 
BOYER-MOORE-MATCHER (T, P, ∑)
1. n ← length [T]
2. m ← length [P]
3. λ ← COMPUTE-LAST-OCCURRENCE-FUNCTION (P, m, ∑)
4. ɣ ← COMPUTE-GOOD-SUFFIX-FUNCTION (P, m)
5. s ← 0
6. while s ≤ n - m
7.     do j ← m
8.         while j > 0 and P [j] = T [s + j]
9.             do j ← j - 1
10.        if j = 0
11.            then print "Pattern occurs at shift" s
12.                s ← s + ɣ [0]
13.            else s ← s + max (ɣ [j], j - λ [T [s + j]])
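As a runnable illustration, here is a simplified Python sketch that implements only the bad-character (last-occurrence) heuristic; a full Boyer-Moore implementation would also use the good-suffix table computed above. The function names are our own:

def last_occurrence(pattern: str) -> dict:
    """λ[c] = index of the last occurrence of c in the pattern (0-based here, unlike the 1-based pseudocode)."""
    return {ch: i for i, ch in enumerate(pattern)}

def boyer_moore_bad_char(text: str, pattern: str) -> list:
    """Boyer-Moore search using only the bad-character heuristic (0-indexed shifts)."""
    n, m = len(text), len(pattern)
    lam = last_occurrence(pattern)
    shifts, s = [], 0
    while s <= n - m:
        j = m - 1
        while j >= 0 and pattern[j] == text[s + j]:   # compare right to left
            j -= 1
        if j < 0:
            shifts.append(s)                          # full match at shift s
            s += 1                                    # (a full implementation would use the good-suffix shift here)
        else:
            bad = text[s + j]                         # mismatching text character
            s += max(1, j - lam.get(bad, -1))         # align its last occurrence in the pattern, or skip past it
    return shifts

print(boyer_moore_bad_char("nyoo nyoo", "noyo"))      # -> [] (no occurrence, as in Example 1)
print(boyer_moore_bad_char("ABAAABCD", "ABC"))        # -> [4]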
NP-Completeness
Introduction
NP-completeness is a fundamental concept in theoretical computer science that addresses the inherent difficulty of certain computational problems. The abbreviation "NP" stands for nondeterministic polynomial time, which is a class of decision problems that can be solved in polynomial time by a nondeterministic Turing machine. NP-completeness provides the ability to identify and classify problems that are believed to be inherently hard, in the sense that efficient algorithms for solving them may not exist. Stephen Cook introduced this concept in 1971 when he showed the existence of problems with this degree of complexity.
The cornerstone of NP-completeness is the notion of a "hard" problem in the NP category. A decision problem is said to be NP-complete if it satisfies two critical conditions:
1. Being in NP: solutions to the problem can be quickly checked using a deterministic polynomial-time algorithm. If someone proposes a solution, its correctness can be verified relatively efficiently.
2. Difficulty: the problem is at least as difficult as the hardest problems in NP. If there is an efficient algorithm for any NP-complete problem, then there is an efficient algorithm for all problems in NP.
This framework was established by the Cook-Levin theorem, which proved the NP-completeness of the Boolean satisfiability problem (SAT). The importance of NP-completeness lies in its effect on algorithmic efficiency: if an efficient algorithm could be found for any NP-complete problem, this would imply the existence of efficient algorithms for all NP problems.
Basically, the question is whether every problem whose solution can be verified quickly can also be solved quickly. The identification of new NP-complete problems has been a critical area of research, and many well-known problems in various fields, including graph theory, optimization, and planning, have been shown to be NP-complete. NP-completeness has far-reaching consequences for cryptography, optimization, and understanding the limits of computing. The P vs. NP problem remains one of the most important unsolved problems in computer science.

History of NP-Completeness
The history of NP-completeness is closely related to the development of computational complexity theory and the study of the inherent difficulty of computational problems. Here is a brief overview of the major milestones:
- Boolean Satisfiability Problem (SAT): The concept of NP-completeness was introduced by Stephen Cook in his landmark 1971 article, "The Complexity of Theorem-Proving Procedures". Cook showed the existence of an NP-complete problem by establishing the NP-completeness of the Boolean satisfiability problem (SAT). Cook's result was groundbreaking and laid the foundation for the theory of NP-completeness.
- Cook-Levin theorem: Stephen Cook's paper presented the Cook-Levin theorem, which shows that the Boolean satisfiability problem is NP-complete. This theorem is fundamental to NP-completeness and is the starting point for proving the NP-completeness of other problems.
- Karp's 21 NP-complete problems: In 1972, Richard Karp published the seminal paper "Reducibility Among Combinatorial Problems", in which he identified 21 distinct combinatorial problems that are NP-complete. This work showed that many problems in different fields, including graph theory, set theory, and optimization, share the same computational difficulty.
- NP-completeness proofs for various problems: After Karp's original set of 21 problems, researchers continued to prove the NP-completeness of additional problems in various fields, including scheduling problems, graph coloring, and the traveling salesman problem.
- Development of complexity classes: As researchers explored the landscape of computational complexity, additional complexity classes such as P (polynomial time), NP (nondeterministic polynomial time), and co-NP were defined. The P vs. NP problem, which asks whether P is equal to NP, became a central question in the field and remains unresolved.
- Cook's theorem and Levin's work: Cook's early work was influential, and Leonid Levin independently developed similar ideas around the same time. Levin's work contributed to the understanding of NP-completeness and complexity theory.
- Contributions from other researchers: Many other researchers have made important contributions to the field of NP-completeness, including Michael Sipser, Neil Immerman, and others who have extended the theory to concepts such as the polynomial-time hierarchy and space complexity.
- Practical implications: Although NP-completeness implies theoretical hardness, it also has practical implications. Many real-world problems believed to be NP-complete exhibit similar computational challenges. This led to the development of approximation algorithms and heuristics to solve these problems in practice.
The history of NP-completeness is characterized by the study of the nature of computation, the limits of efficient algorithms, and the classification of problems based on their inherent difficulty. The P vs. NP problem remains one of the most important open questions in computer science.
The NP-completeness Problem
The notion of NP-completeness is a central theme in computational complexity theory. It refers to a class of problems that are both in the NP (nondeterministic polynomial time) complexity class and, informally speaking, as hard as the hardest problems in NP. Here are some key points about NP-completeness:
- Definition of NP-completeness: A decision problem is in NP if a proposed solution can be verified quickly (in polynomial time). A problem is NP-complete if it is in NP and every problem in NP can be reduced to it in polynomial time.
- Cook's theorem: Cook's theorem, proved by Stephen Cook in 1971, was the first to establish the notion of NP-completeness. It showed that the Boolean satisfiability problem (SAT) is NP-complete.
- Transitivity of polynomial-time reduction: If an NP-complete problem B can be reduced to a problem A in polynomial time, then A is at least as hard as B; if A is also in NP, then A is NP-complete as well. This is the basis for proving the NP-completeness of many problems.
- Consequences of NP-completeness: If any NP-complete problem can be solved in polynomial time, then every problem in NP can be solved in polynomial time. This is the P vs. NP question, one of the most famous open problems in computer science.
- Common NP-complete problems: Many well-known problems are NP-complete, including the traveling salesman problem, the knapsack problem, and the clique problem.
- Practical implications: NP-completeness of a problem does not necessarily mean that every instance is difficult to solve in practice. It means that, in the worst case, the time required by known deterministic algorithms grows exponentially as the size of the input increases.
- Approximation algorithms: For many NP-complete problems, finding an exact solution in polynomial time is unlikely. Instead, approximation algorithms are often used to find near-optimal solutions in a reasonable amount of time.
- Applications: NP-complete problems arise in many fields, including cryptography, optimization, planning, and logistics.
- Heuristic approaches: Since solving NP-complete problems exactly is difficult, heuristic approaches and approximation algorithms are often used in practice to find solutions that are good enough for practical purposes.
Understanding NP-completeness has important implications for the limits of efficient computing and for the design of algorithms and problem-solving strategies.
Solving NP-Complete Problems
Solving NP-complete problems is a difficult task because, by definition, these problems are probably hard to solve efficiently (in polynomial time). However, there are several strategies and approaches used to attack NP-complete problems:
1. Exact algorithms: Although NP-complete problems are generally difficult to solve exactly, small instances or special cases of these problems can sometimes be solved efficiently. Exact algorithms may be feasible for small problem sizes or for special cases.
2. Approximation algorithms: Instead of striving for an exact solution, approximation algorithms provide solutions that are guaranteed to be close to the optimal solution. These algorithms often have polynomial time complexity and are suitable for practical applications.
3. Heuristic algorithms: Heuristic algorithms are designed to quickly find a good, although not necessarily optimal, solution. They are often used for large NP-complete problems where finding an exact solution is impractical. Examples include genetic algorithms, simulated annealing, and local search algorithms.
4. Problem-specific techniques: Some NP-complete problems have special properties that can be exploited to develop specialized algorithms. Understanding the characteristics of a given problem can lead to more efficient algorithms.
5. Parameterized complexity: In parameterized complexity theory, the goal is to identify parameters that make a problem tractable. Although a problem may be NP-complete in general, it can often be solved efficiently when certain parameters are small.
6. Parallel and distributed computing: Some NP-complete problems can be attacked more effectively using parallel or distributed computing. Breaking down a large problem into smaller subproblems that can be solved simultaneously can speed up the overall solution.
7. Problem-specific reductions: In some cases, reductions can be used to transform an instance of one NP-complete problem into an instance of a problem with known properties. This allows existing algorithms or heuristics for other problems to be reused. Specialized hardware: Using specialized hardware such as GPUs or quantum computers can speed up the solution of certain instances of NP-complete problems.
8. Preprocessing and pruning: Preprocessing techniques can be used to simplify the problem before applying more complex algorithms. Pruning techniques can remove branches from the search space that are unlikely to lead to optimal solutions.
9. Metaheuristics: Metaheuristic approaches such as genetic algorithms, particle swarm optimization, and ant colony optimization are general problem-solving strategies that can be adapted to a variety of NP-complete problems.
It is important to note that the choice of approach depends on the specific NP-complete problem, the size of the problem instance, and the available resources. Many real-world applications use a combination of these techniques to find practical solutions. However, for certain instances and problem sizes, finding the optimal solution can remain computationally challenging.
Approximation Algorithms and Randomized Algorithms
Randomized algorithms are algorithms that use random numbers to decide what to do next at some points in their logic; that is, they use randomization in their operation. Due to this randomness, the expected time or space complexity can often be reduced compared with the corresponding deterministic algorithm.
In other words, randomized algorithms use random numbers to find solutions to problems and to improve the performance of the algorithms. Such algorithms are categorized into the following two types:
1. Monte Carlo randomized algorithms: These randomized algorithms have a small probability that their output is incorrect.
2. Las Vegas randomized algorithms: These randomized algorithms always give a correct result or report failure.

Approximation algorithm
An approximation algorithm finds approximate solutions for optimization problems. It does not always guarantee to produce the best solution, but it tries to come close to the optimal value in polynomial time, and it provides a provable guarantee on how close to the optimum its solution is.
Randomized
approximation algorithm
Approximation algorithms are efficient
algorithms for solving optimization problems. However, when the concept of
randomness is used in approximation algorithms, it enhances the algorithm’s
performance.
Randomized approximation algorithms can efficiently solve problems for which no comparable deterministic algorithm is known. They are also used in simulated annealing methods for solving optimization problems.
Randomized algorithms make random selections that vary from run to run on the input given to the algorithm. If a significant fraction of the runs of the algorithm work well for the given input, a procedure that restarts the algorithm after a particular amount of time during its execution can be used, which helps speed up the algorithm.
Randomized approximation algorithm example (Max-Cut algorithm)
It is possible to prove that the greedy approximation algorithm yields a 2-approximation for the max-cut problem. However, if randomization is used, a much shorter algorithm that gives the same guarantee and executes in expected polynomial time can be obtained.
Randomized max-cut algorithm: Consider an undirected graph G = (V, E) with a positive weight for every edge. The max-cut problem asks for a partition of V into two subsets A and B that maximizes the combined weight of the edges with one endpoint in A and the other in B. The randomized algorithm simply places each vertex in A or B uniformly at random; if fewer than m/2 edges cross the resulting cut, the process is repeated.
Theorem: The random cut is a ½-approximation algorithm for the max-cut problem that executes in expected polynomial time.
Proof:
Consider X as a random variable representing the number of edges crossing the random cut. For i = 1, 2, ..., m, let Xi be the indicator variable that is 1 if the i-th edge crosses the random cut and 0 otherwise. So X = ∑_{i=1}^{m} Xi and, by linearity of expectation, E[X] = ∑_{i=1}^{m} E[Xi]. Now, the probability that an edge {u, v} crosses a randomly selected cut is ½, since u and v are placed independently in one of the two partitions and the chance that they end up in different partitions is ½. Therefore, E[Xi] = ½ for every i. Hence, E[X] = m/2.
This shows that a random cut gives m/2 crossing edges in expectation. Now, the goal is to turn this into a randomized algorithm that always outputs a good cut and whose running time is a random variable with polynomial expectation. This can be done by bounding the probability that X ≥ m/2 when a random cut is selected. In the worst case, when X ≥ m/2 all the probability mass sits at m, and when X < m/2 all the probability mass sits at m/2 - 1; this makes the expectation of X as large as possible, so
m/2 = E[X] ≤ (1 - Pr[X ≥ m/2]) * (m/2 - 1) + Pr[X ≥ m/2] * m.
Solving for Pr[X ≥ m/2] shows that it is at least 2/(m+2). This means the expected number of iterations of the algorithm is at most (m+2)/2. Hence the algorithm executes in expected polynomial time and generates a cut of size at least m/2. In other words, the expected weight of the generated cut is at least half the weight of the maximum cut (which contains at most m edges).
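A minimal Python sketch of this randomized ½-approximation is shown below (the function name random_max_cut and the edge-list input format are our own choices):

import random

def random_max_cut(num_vertices: int, edges: list) -> tuple:
    """Repeatedly pick a uniformly random partition (A, B) until at least m/2 edges cross it."""
    m = len(edges)
    while True:
        # Place each vertex in A (True) or B (False) independently with probability 1/2.
        side = [random.random() < 0.5 for _ in range(num_vertices)]
        crossing = sum(1 for u, v in edges if side[u] != side[v])
        if crossing >= m / 2:                 # expected number of repetitions is at most (m+2)/2
            A = [v for v in range(num_vertices) if side[v]]
            B = [v for v in range(num_vertices) if not side[v]]
            return A, B, crossing

# Example: a 4-cycle 0-1-2-3-0 (the maximum cut has 4 edges, so at least 2 crossing edges are guaranteed here).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(random_max_cut(4, edges))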
Applications of the randomized approximation algorithm
One of the main applications of randomized approximation algorithms is randomized backtracking. Backtracking helps in solving combinatorial problems and typically takes exponential time to execute. However, recent research has shown that when randomness is introduced into backtracking, such problems can often be solved much more quickly.
Backtracking problems typically contain small substructures. Each substructure has the property that the solution to the entire problem can be identified if any one substructure is solved correctly. For this reason, these problems become tractable when solved using randomization techniques. The overall execution time that backtracking takes to generate a solution is reduced significantly by repeatedly restarting the backtracking search after a polynomial amount of time.