Dispositional Turing Machine

FIELD OF INVENTION
===========

This invention aims to give computers the ability to retain information that they receive as text, as interpreted speech, or through any other input system that uses natural language. This would allow a computer to give intelligent responses to the questions it is asked. It would be useful in optimising search engine performance as well as in processing information that is available in natural language format.

BACKGROUND OF INVENTION
===========

There have been many attempts to endow machines with the ability to compute using concepts. The earliest systems used rule based algorithms, where the information that a computer was to give in response to a query was manually entered into a database in a way that made it easy for the computer to return it. However, there was only so much a machine could know under this scheme, and new questions were difficult for it to interpret. A move was then made to statistical methods coupled with machine learning algorithms that automatically derive the relationships between words and use this "understanding" to answer simple questions. At present, methods that tag the part of speech elements in sentences in a structured way, making the tagged data easy for the machine to manipulate, show some promise. These represent a mix between the old expert systems, in that the training set has to be built manually, and statistical methods, in that new data is sorted using the rules the machine has learnt from the training set.

This invention is a retake on the old statistical approach. It is a development of the ideas found in patent...... That document outlines a method for the computer to give a response to either a user or another computer based on the concepts in the sentence used in the query. The concepts are compared to a database of predetermined statements in order to give a reply. Weights are applied to the concepts to determine their effect on the result that is returned.

The problem that has not been solved by this and other inventions is that the machine cannot manipulate information in a way that represents a disposition to respond to questions in a general way, in both specific and general scope, such that it can be said to choose what to say dynamically and not by rote. It is also not possible to train the machine to retain larger features of the training data, such as abstractions that are not contained in explicit statements within that data set. This means that it cannot learn about things which it is not explicitly programmed to know.

OBJECT OF THE INVENTION
===========

This invention seeks to solve the problem of a dual database, where the representation of words in the databases of contemporary solutions does not have, as a property, a disposition for appearing in specific sentences in a specific role. This disposition represents knowledge that the system has acquired during training. The solution takes the form of probabilities that given words will be present in an oncoming sentence, given the specific words, in their specific order, in the current sentence. During training the system reads through a large corpus of material, comparing each sentence to the next and assigning a weight to each mapping of the words in one sentence to the words in the other. The problem of how to respond intelligently to a query or a search phrase then becomes the system's disposition to return the words which correlate with the words in the question and have the highest value when all those words are considered. The sentence returned represents an intelligent response informed by the data that the system was trained on. Another problem to be solved is machine understanding. This allows the machine to be fed a large corpus of text on a specific field, after which queries can be made to the system about the information that can be deduced from that corpus.

STATEMENT OF INVENTION
=======
This invention is an artificial intelligence system that retains information by assigning weights to word pairs as found in consecutive sentences. These weights establish a disposition for the group of words with the highest weights to follow a given sentence, which is the sentence that determines the weights. Short term memory is stored in the form of the cumulative effect of each statement on the weights of the following statements. Additionally, during training, sections of the training data are labelled as belonging to similar topics, and this information is used to give the short term memory further sensitivity.

SUMMARY OF INVENTION
=======
This invention is a natural language understanding system that goes through a learning phase to learn the relationship between words in consecutive sentences and so establish a pattern. Weights are assigned to each word-to-word pair, taking care to specify the position of each word in its sentence. As training continues these weights are adjusted. During the operation phase the weights in the database are manipulated depending on the specific words in the current sentence. This manipulation represents the disposition of the system to return a particular phrase or sentence.

DETAILED DESCRIPTION OF INVENTION
=========

In one embodiment, a database containing all the words in a particular language may be stored with no relationships between the words established. The system is then given a large corpus of material to go through.
On the first two sentences, it creates a one-to-many mapping between each word in the first sentence and all the words in the next. Each mapping is then stored in the database as a relationship between those words.
The position of each word in its sentence is specified in these relationships.
A weight of 50% is applied to each of those mappings.
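The initial mapping step can be illustrated with a minimal sketch. The function name `map_sentence_pair`, the tuple layout of the keys, and the whitespace tokenisation are assumptions made for illustration only; the text specifies just the one-to-many mapping, the recorded word positions, and the 50% starting weight.

```python
from itertools import product

INITIAL_WEIGHT = 0.5  # the 50% starting weight stated in the text


def map_sentence_pair(current, following):
    """Map every word of `current` to every word of `following`.

    Keys are (word, position, next_word, next_position) so that the
    position of each word in its sentence is part of the relationship.
    Splitting on whitespace is an assumed, simplistic tokeniser.
    """
    cur = current.lower().split()
    nxt = following.lower().split()
    return {
        (w, i, v, j): INITIAL_WEIGHT
        for (i, w), (j, v) in product(enumerate(cur), enumerate(nxt))
    }


# Three words mapped onto two words gives six weighted relationships.
weights = map_sentence_pair("the cat sat", "it purred")
```

Keeping the positions inside the key means the same two words at different positions are stored as separate relationships, which is one way to read the requirement that word order be specified.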

As the system continues reading the corpus, three different events can occur, each with a different consequence.

1. A word that was mapped before is related again to another word in exactly the same manner.
Consequence: the weight attached to that mapping increases by a percentage.

2. A word met previously occurs again in a preceding sentence but does not duplicate any of the relationships it had established earlier with the next sentence.
Consequence: the weights associated with all its previous mappings are reduced by a percentage.

3. A new word pair is met.
Consequence: a weight of 50% is applied to this mapping.

The effect of this method is to give a strong correlation to word pairs in which one word occurs in a preliminary sentence and the other in the oncoming sentence. The more often this occurs, the stronger the learnt correlation becomes, which is the purpose of point 1. Point 2 serves to weaken the correlation between words that were previously strongly correlated, because instances where they did not pair up have been discovered. Point 3 initialises new word pairs, which will themselves be subject to points 1 and 2 as training continues.
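The three events above could be sketched as a single training step. This is only one possible reading: the step size `STEP`, the multiplicative form of the adjustment, the cap at 1.0, and the name `train_pair` are assumptions, since the text says only that weights rise or fall "by a percentage".

```python
INITIAL = 0.5  # starting weight for a new pair, as stated in the text
STEP = 0.10    # assumed size of the unspecified "percentage"


def train_pair(weights, current, following):
    """Update `weights` in place for one pair of consecutive sentences.

    Keys are (word, position, next_word, next_position).
    """
    cur = current.lower().split()
    nxt = following.lower().split()
    known = set(weights)  # mappings learnt before this sentence pair
    for i, w in enumerate(cur):
        earlier = {k for k in known if k[0] == w}
        pairs = [(w, i, v, j) for j, v in enumerate(nxt)]
        repeated = [k for k in pairs if k in known]
        # Event 1: the same mapping recurs, so it is strengthened.
        for k in repeated:
            weights[k] = min(1.0, weights[k] * (1 + STEP))
        # Event 2: the word recurs but none of its old mappings do,
        # so all of its earlier mappings are weakened.
        if earlier and not repeated:
            for k in earlier:
                weights[k] *= 1 - STEP
        # Event 3: pairs not seen before start at the initial weight.
        for k in pairs:
            weights.setdefault(k, INITIAL)
    return weights


weights = {}
train_pair(weights, "the cat sat", "it purred")  # event 3 only
train_pair(weights, "the cat sat", "it purred")  # event 1 for all six pairs
train_pair(weights, "cat naps", "dogs bark")     # event 2 for "cat"
```

In this run the mappings from "the" end slightly above 50%, while the earlier mappings from "cat" fall back below it because "cat" reappeared without repeating any of them.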

As the system reads, it creates a set of results for a certain number of consecutive sentences, where this number is chosen according to the frequency of certain terms that occur in that chain of sentences and not elsewhere. It then labels this group of results a topic. Word pairs in specific topics will have their weights adjusted to reflect their increased correlation within a topic. This makes concepts that are related by being the focus of the present discourse correlate slightly more strongly than those in another discourse. This will be helpful in narrowing down which concepts to return should the choice otherwise be unclear to the system. Topics themselves will be grouped into larger topics formed by concatenating the smaller ones.
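A rough sketch of the topic adjustment follows. The text does not say how terms are counted or how large the adjustment is, so everything here is an assumption: a term counts as distinctive if it appears in at least `min_count` sentences of the chain and in no sentence outside it, and a pair is boosted by `TOPIC_BOOST` only when both of its words are distinctive. The names `distinctive_terms` and `boost_topic_pairs` are hypothetical.

```python
TOPIC_BOOST = 0.05  # assumed; the text only says "slightly stronger"


def distinctive_terms(sentences, start, end, min_count=2):
    """Terms occurring in at least `min_count` sentences of
    sentences[start:end] and in no sentence outside that chain."""
    outside = set()
    for s in sentences[:start] + sentences[end:]:
        outside.update(s.lower().split())
    counts = {}
    for s in sentences[start:end]:
        for w in set(s.lower().split()):
            counts[w] = counts.get(w, 0) + 1
    return {w for w, c in counts.items() if c >= min_count and w not in outside}


def boost_topic_pairs(weights, terms):
    """Raise every mapping whose two words both belong to the topic."""
    for (w, i, v, j), value in list(weights.items()):
        if w in terms and v in terms:
            weights[(w, i, v, j)] = min(1.0, value * (1 + TOPIC_BOOST))
    return weights


sentences = ["engines burn fuel", "fuel powers engines", "birds eat seeds"]
topic = distinctive_terms(sentences, 0, 2)
weights = {("engines", 0, "fuel", 0): 0.5, ("burn", 1, "fuel", 0): 0.5}
boost_topic_pairs(weights, topic)
```

Here "engines" and "fuel" form the topic, so only the mapping between those two words is strengthened; the mapping from "burn" is left unchanged.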

Before the system is ready it must go through another form of training. This takes the form of a beta trial in which it is placed in its working environment, which may be a question and answer session where a human user asks it questions on what it has read. A copy of the database is first loaded onto the computer. This new copy has a weight of 1 assigned to all the word pairs. As the words in each question are entered, the system adjusts the weights of the words related to them by multiplying the learned values from the trained database with the values in the copy of the database. The words in the question are in effect casting a vote in the database for which words they would have appear in the answer, and at what position in the statement. Their combined "opinion" fully determines the words used and their order in the next statement. Once the question has been asked, the system returns the words with the highest values in the database, in the order that has been predicted.

On encountering the next question, all the weights in the copy of the database are scaled down, and new weights are then determined as before from the words of the new question. Note that because the weights from the first question have been scaled down rather than reset to 1, the system retains a preference to output those words in the next sentence (representing short term memory). This preference may or may not be overridden when the weights determined by the next statement are multiplied onto the existing weights in the copy of the database. The sentence ends naturally where there is a very large gap between the last two weights in the database. Further refinement is needed to ensure that the sentence is well structured.

Additionally, in this beta trial, when the system forms a statement it will go online and use a search engine to search for the exact sentence that it has formed. If it finds it, it strengthens the weights for the word pairs between the sentences so as to increase the gap between the last word in the sentence and the one that would have followed it had the system continued building the sentence.

If it does not find the sentence, the system permutes the words in the sentence until a match is found on the internet. The weights in the database are then changed accordingly to reflect this new information.
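The voting and short term memory behaviour of the beta trial might look like the sketch below. It rests on several assumptions that are not in the text: the working copy is collapsed to one score per answer slot (word, position) rather than one per word pair; a vote multiplies the score by 1 plus the learned weight so that a vote always raises it above the neutral value of 1; scaling down shrinks only the part of a score above 1; and a sentence ends when a slot's boost is less than `GAP_RATIO` times the previous slot's boost. The class name `Responder` and both constants are hypothetical, and the search engine check is left out.

```python
DECAY = 0.5      # assumed share of each boost kept between questions
GAP_RATIO = 0.5  # assumed threshold for a "very large gap"


class Responder:
    def __init__(self, trained):
        self.trained = trained  # {(word, pos, next_word, next_pos): weight}
        self.scores = {}        # working copy; an absent slot is neutral (1.0)

    def ask(self, question):
        # Scale earlier boosts down instead of resetting them to 1,
        # so earlier questions still influence this answer.
        self.scores = {
            slot: 1.0 + (s - 1.0) * DECAY for slot, s in self.scores.items()
        }
        asked = set(enumerate(question.lower().split()))
        # Each question word votes for the answer slots it is mapped to.
        for (w, i, v, j), weight in self.trained.items():
            if (i, w) in asked:
                self.scores[(v, j)] = self.scores.get((v, j), 1.0) * (1.0 + weight)
        return self._build_sentence()

    def _build_sentence(self):
        best = {}  # position -> (score, word)
        for (v, j), s in self.scores.items():
            if j not in best or s > best[j][0]:
                best[j] = (s, v)
        words, previous = [], None
        for j in sorted(best):
            score, word = best[j]
            # End the sentence at a large drop between consecutive slots.
            if previous is not None and score - 1.0 < GAP_RATIO * (previous - 1.0):
                break
            words.append(word)
            previous = score
        return " ".join(words)


trained = {
    ("why", 0, "because", 0): 0.9, ("sky", 2, "because", 0): 0.6,
    ("sky", 2, "light", 1): 0.8, ("blue", 3, "light", 1): 0.7,
    ("sky", 2, "scatters", 2): 0.7, ("blue", 3, "scatters", 2): 0.8,
    ("is", 1, "the", 3): 0.1,
}
responder = Responder(trained)
first = responder.ask("why is sky blue")  # "because light scatters"
second = responder.ask("why")             # "light" keeps a boost from before
```

The second call shows the short term memory: the slot for "light" receives no vote from the question "why", yet its score stays above the neutral value because the earlier boost was scaled down rather than discarded.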

During operation the system will now be able to answer questions pertaining to the subject matter it was trained on.

A consideration is that, with disparate reading material, the choice of consecutive sentences becomes determined by the current topic of discourse.

A topic would be a grouping of concepts that recur often within a certain number of sentences, or a pattern that emerges from the relationships the words have as a whole in the group of sentences.

When it is being learnt, a question and answer session takes on a certain structure, represented by the prevalence of certain terms in a certain role or position within the sentences, and this structure then forms a topic. This information is represented in the copy of the database: the system identifies the pattern by tracking the topology of the distribution of weights in the database. For example, if it has encountered a lot of "why"s and "what"s, it adjusts the weights of the words in the database that feature in many answers, such as "because". When a question is then asked, the system can track that it is expected to return an answer, because as it distributes the weights the word "because" already has a significant weight. That weight only increases once it is multiplied by the weights given to it by the words in the current statement, which makes the word likely to be used in the answer. This topicalisation would help the system choose the sentence with the right role in a given phase of operation, whether it is required to write an exposition, answer questions, or search for the result of a question in a certain manner.

The meat of the idea is in the different levels used to predict responses. Word-to-word mappings give a general way to predict what is likely to be returned in the next statement. The topics chosen or identified by the system (in terms of semi-discrete changes in how the weights are being assigned and the resultant topology of the database) then refine the choice of which sentences should be returned, by influencing the weights further to reflect a preference for the terms related to the topic.

There would be different ways of grouping topics, some of which are mutually exclusive and some of which are not. The simplest topics would be formed by identifying large scale changes in the weights of the words in the database. As patterns appear in this large scale representation, unsupervised learning techniques applied during training, such as clumping and other frequency distribution approaches, will segment the database according to how its weights respond.

Patent publications:

No publication

Seller:

moyo (Zimbabwe)
