Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
433 views
in Technique[技术] by (71.8m points)

String Searching Algorithm that uses a graph ? C++

Code Instructions

Hey guys. Above is a coding project I have been assigned. Im reading the instructions and am completely lost because I've never learned how to code an undirected graph? Not sure how my professor expects us to do this but I was hoping I could get some help from experts. Any readings (or tips) you guys suggest I can look at to familiarize myself with how to get started on the program? Appreciate it, thanks!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The problem to solve is called "Word Morph". Your instructor gave some restrictions as to use an undirected graph, where the neighbour node differs only one character from the origin. Unfortuantely the requirements are not clear enough. "Differ by one character is ambiguous. If we use the replace-insert-delete idiom, then we can use other functions as by comparing 2 equal size strings. I assume the full approach.

And, at the end, you need to find a shortest way through a graph.

I could present you one possible solution. A complete working code example.

By the way the graph is non-weigthed, because the cost of travelling from one node to the next is always 1. So actually we are talking about an undirected non-weighted graph.

The main algorithms we need using here are:

  • Levensthein. Calculate distance of 2 strings
  • and Breadth First Search, to find the shortes path through a graph

Please note, If the words should have the same length, then no Levensthein is needed. Just compare character by charcter and count the differences. That's rather simple. (But as said: The instructions are a little bit unclear)

Both algorithms can be modified. For example: You do not need a Levensthein distance greater than 1. You can terminate the distance calculation after distance one has been found. And, in the breadth first search, you could show the path, through which you are going.

OK, now, how to implement an undirected graph. There are 2 possibilities:

  1. A Matrix (I will not explain)
  2. A list or a vector

I would recommend the vector approach for this case. A matrix would be rather sparse, so, better a vector.

The basic data structure that you need is a node containing vertices and neighbours. So you have the word (a std::string) as vertex and the "neighbours". That is a std::vector of index positions to other nodes in the graph.

The graph is a vector of nodes. And nodes neighbours point to other nodes in this vector. We use the index into the vector to denote neighbours. All this we pack into a structure and call it "UndirectedGraph". We add a "build" function that checks for adjacencies. In this function we compare each string with any and check, if the difference is <2, so 0 or 1. 0 means equeal and 1 is a given constraint. If we find such a difference, we add it as neighboour in the corresponding node.

Additionally we add a breadth first search algorithm. It is described in Wikipedia

To ease up the implementation of that algortuhm we a a "visited" flag to the node.

Please see the code below:

#include <sstream>
#include <iostream>
#include <vector>
#include <string>
#include <iterator>
#include <iomanip>
#include <numeric>
#include <algorithm>
#include <queue>

std::istringstream textFileStream(R"#(peach
peace
place
plane
plans
plays
slays
stays
stars
sears
years
yearn
)#");

using Vertex = std::string;
using Edge = std::vector<size_t>;

// One node in a graph
struct Node
{
    // The Vertex is a std::string
    Vertex word{};
    // The edges are the index of the neighbour nodes
    Edge neighbour{};
    // For Breath First Search
    bool visited{ false };

    // Easy input and output
    friend std::istream& operator >> (std::istream& is, Node& n) {
        n.neighbour.clear();
        std::getline(is, n.word);
        return is;
    }
    friend std::ostream& operator << (std::ostream& os, const Node& n) {
        os << std::left << std::setw(25) << n.word << " --> ";
        std::copy(n.neighbour.begin(), n.neighbour.end(), std::ostream_iterator<int>(os, " "));
        return os;
    }
};

// The graph
struct UndirectedGraph
{
    // Contains a vector of nodes
    std::vector<Node> graph;

    // build adjacenies
    void build();

    // Find Path
    bool checkForPathFromStartToEnd(size_t start, size_t end);
    bool checkForPath() {bool result = false;if (graph.size() > 1) {size_t s = graph.size() - 2;size_t e = s + 1;result = checkForPathFromStartToEnd(s, e); }return result; }

    // Easy input and output
    friend std::istream& operator >> (std::istream& is, UndirectedGraph& ug) {
        ug.graph.clear();
        std::copy(std::istream_iterator<Node>(is), std::istream_iterator<Node>(), std::back_inserter(ug.graph));
        return is;
    }
    friend std::ostream& operator << (std::ostream& os, const UndirectedGraph& ug) {
        size_t i{ 0 };

        for (const Node& n : ug.graph)
            os << std::right << std::setw(4) << i++ << ' ' << n << '
';
        return os;
    }

};

// Distance between 2 strings
size_t levensthein(const std::string& string1, const std::string& string2)
{
    const size_t lengthString1(string1.size());
    const size_t lengthString2(string2.size());

    if (lengthString1 == 0) return lengthString2;
    if (lengthString2 == 0) return lengthString1;

    std::vector<size_t> costs(lengthString2 + 1);
    std::iota(costs.begin(), costs.end(), 0);

    for (size_t indexString1 = 0; indexString1 < lengthString1; ++indexString1) {
        costs[0] = indexString1 + 1;
        size_t corner = indexString1;

        for (size_t indexString2 = 0; indexString2 < lengthString2; ++indexString2) {
            size_t upper = costs[indexString2 + 1];
            if (string1[indexString1] == string2[indexString2]) {
                costs[indexString2 + 1] = corner;
            }
            else {
                const size_t temp = std::min(upper, corner);
                costs[indexString2 + 1] = std::min(costs[indexString2], temp) + 1;
            }
            corner = upper;
        }
    }
    size_t result = costs[lengthString2];
    return result;
}

// Build the adjacenies
void UndirectedGraph::build()
{
    // Iterate over all words in the graph
    for (size_t i = 0; i < graph.size(); ++i)
        // COmpare everything with everything (becuase of symmetries, omit half of comparisons)
        for (size_t j = i + 1; j < graph.size(); ++j)
            // Chec distance of the 2 words to compare
            if (levensthein(graph[i].word, graph[j].word) < 2U) {
                // And store the adjacenies
                graph[i].neighbour.push_back(j);
                graph[j].neighbour.push_back(i);
            }
}
bool UndirectedGraph::checkForPathFromStartToEnd(size_t start, size_t end)
{
    // Assume that it will not work
    bool result = false;

    // Store intermediate tries in queue
    std::queue<size_t> check{};

    // Set initial values
    graph[start].visited = true;
    check.push(start);

    // As long as we have not visited all possible nodes
    while (!check.empty()) {
        // Get the next node to check
        size_t currentNode = check.front(); check.pop();
        // If we found the solution . . .
        if (currentNode == end) {
            // The set resultung value and stop searching
            result = true;
            break;
        }
        // Go through all neighbours of the current node
        for (const size_t next : graph[currentNode].neighbour) {
            // If the neighbour node has not yet been visited
            if (!graph[next].visited) {
                // Then visit it
                graph[next].visited = true;
                // And check following elements next time
                check.push(next);
            }
        }
    }
    return result;
}

int main()
{
    // Get the filename from the user
    std::cout << "Enter Filename for file with words:
";
    std::string filename{};
    //std::cin >> filename;

    // Open the file
    //std::ifstream textFileStream(filename);
    // If the file could be opened . . .
    if (textFileStream) {

        // Create an empty graph
        UndirectedGraph undirectedGraph{};

        // Read the complete file into the graph
        textFileStream >> undirectedGraph;

        Node startWord{}, targetWord{};
        std::cout << "Enter start word and enter target word
";  // teach --> learn 
        std::cin >> startWord >> targetWord;

        // Add the 2 words at the and of our graph
        undirectedGraph.graph.push_back(startWord);
        undirectedGraph.graph.push_back(targetWord);

        // Build adjacency graph, including the just added words
        undirectedGraph.build();
        // For debug purposes: Show the graph
        std::cout << undirectedGraph;

        std::cout << "

Morph possible?  --> " << std::boolalpha << undirectedGraph.checkForPath() << '
';
    }
    else {
        // File could not be found or opened
        std::cerr << "Error: Could not open file : " << filename;
    }
    return 0;
}

Please note: Although I have implemented asking for a file name, I do not use it in this example. I read from a istringstream. You need to delete the istringstream later and comment in the existing statements.

Reagarding the requirements from the instructor: I did not use any STL/Library/Boost searching algorithm. (What for? In this example?) But I use of course other C++ STL container. I will not reinvent the wheel and come up with a new "vector" or queue. And I will definitely not use "new" or C-Style arrays or pointer arithmetic.

Have fun!

And to all others: Sorry: I could not resist to write the code . . .


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...