Richardson–Lucy deconvolution

The Richardson–Lucy algorithm, also known as Lucy–Richardson deconvolution, is an iterative procedure for recovering an underlying image that has been blurred by a known point spread function. It was named after William Richardson and Leon B. Lucy, who described it independently.

== Description ==

When an image is produced using an optical system and detected using photographic film, a charge-coupled device or a CMOS sensor, for example, it is inevitably blurred: an ideal point source does not appear as a point but is spread out into what is known as the point spread function. Extended sources can be decomposed into the sum of many individual point sources, so the observed image can be represented in terms of a transition matrix $p$ operating on an underlying image:

$$d_{i} = \sum_{j} p_{i,j} u_{j},$$

where $u_{j}$ is the intensity of the underlying image at pixel $j$, and $d_{i}$ is the detected intensity at pixel $i$. In general, a matrix whose elements are $p_{i,j}$ describes the portion of light from source pixel $j$ that is detected in pixel $i$. In most good optical systems (or in general, linear systems that are described as shift-invariant) the transfer function $p$ can be expressed simply in terms of the spatial offset between the source pixel $j$ and the observation pixel $i$:

$$p_{i,j} = P(i - j),$$

where $P(\Delta i)$ is called a point spread function. In that case the above equation becomes a convolution. This has been written for one spatial dimension, but most imaging systems are two-dimensional, with the source, detected image, and point spread function all having two indices. So a two-dimensional detected image is a convolution of the underlying image with a two-dimensional point spread function $P(\Delta x, \Delta y)$ plus added detection noise.
To estimate $u_{j}$ given the observed $d_{i}$ and a known $P(\Delta i_{x}, \Delta j_{y})$, the following iterative procedure is employed, in which the estimate of $u_{j}$ (called $\hat{u}_{j}^{(t)}$) for iteration number $t$ is updated as follows:

$$\hat{u}_{j}^{(t+1)} = \hat{u}_{j}^{(t)} \sum_{i} \frac{d_{i}}{c_{i}} p_{ij},$$

where

$$c_{i} = \sum_{j} p_{ij} \hat{u}_{j}^{(t)},$$

and $\sum_{j} p_{ij} = 1$ is assumed. It has been shown empirically that if this iteration converges, it converges to the maximum likelihood solution for $u_{j}$. Writing this more generally for two (or more) dimensions in terms of convolution with a point spread function $P$:

$$\hat{u}^{(t+1)} = \hat{u}^{(t)} \cdot \left( \frac{d}{\hat{u}^{(t)} \otimes P} \otimes P^{*} \right),$$

where the division and multiplication are element-wise, $\otimes$ indicates a 2D convolution, and $P^{*}$ is the mirrored point spread function, or the inverse Fourier transform of the Hermitian transpose of the optical transfer function. In problems where the point spread function $p_{ij}$ is not known a priori, a modification of the Richardson–Lucy algorithm has been proposed in order to accomplish blind deconvolution.
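The matrix form of the update can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not a reference implementation: the function names `circulant_psf_matrix`, `blur` and `richardson_lucy` are made up for this example, the point spread function wraps around at the edges so that every row and column of $p$ sums to 1, the starting estimate is flat, and the iteration count is fixed rather than chosen by a stopping rule.

```python
def circulant_psf_matrix(kernel, n):
    """Build p[i][j] = P((i - j) mod n) from {offset: weight}.

    Wrap-around edges are an assumption made so that every row and
    every column of the matrix sums to 1.
    """
    return [[kernel.get((i - j) % n, 0.0) for j in range(n)] for i in range(n)]


def blur(p, u):
    """Forward model: c_i = sum_j p_ij * u_j."""
    return [sum(p_ij * u_j for p_ij, u_j in zip(row, u)) for row in p]


def richardson_lucy(d, p, iterations=50):
    """Iterate u_j <- u_j * sum_i (d_i / c_i) * p_ij from a flat start."""
    n_i, n_j = len(p), len(p[0])
    u = [sum(d) / n_j] * n_j  # flat, strictly positive initial estimate
    for _ in range(iterations):
        c = blur(p, u)
        # Guard against division by zero where the prediction vanishes.
        ratio = [d_i / c_i if c_i > 0 else 0.0 for d_i, c_i in zip(d, c)]
        u = [
            u[j] * sum(ratio[i] * p[i][j] for i in range(n_i))
            for j in range(n_j)
        ]
    return u


# Toy example: a single bright pixel blurred by a three-pixel PSF.
p = circulant_psf_matrix({0: 0.5, 1: 0.25, 4: 0.25}, 5)
observed = blur(p, [0.0, 0.0, 10.0, 0.0, 0.0])
estimate = richardson_lucy(observed, p, iterations=200)
print([round(v, 3) for v in estimate])
```

Because the columns of this toy $p$ sum to 1, each update preserves the total intensity of the observed data, and on noise-free input the estimate sharpens back toward the single bright pixel. Real images are noisy and two-dimensional, so practical implementations usually use FFT-based convolution and stop early.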
== Derivation ==

In the context of fluorescence microscopy, the probability of measuring a set of photon numbers (or digitization counts proportional to detected light) $\mathbf{m} = [m_{0}, \dots, m_{K}]$ for expected values $\mathbf{E} = [E_{0}, \dots, E_{K}]$ for a detector with $K+1$ pixels is given by

$$P(\mathbf{m} \mid \mathbf{E}) = \prod_{i}^{K} \operatorname{Poisson}(E_{i}) = \prod_{i}^{K} \frac{E_{i}^{m_{i}} e^{-E_{i}}}{m_{i}!}.$$

Since in the context of maximum-likelihood estimation the aim is to locate the maximum of the likelihood function without concern for its absolute value, it is convenient to work with $\ln(P)$:

$$\ln P(\mathbf{m} \mid \mathbf{E}) = \sum_{i}^{K} \left[ (m_{i} \ln E_{i} - E_{i}) - \ln(m_{i}!) \right].$$

Moreover, since $\ln(m_{i}!)$ is a constant, it does not give any additional information regarding the position of the maximum, so consider

$$\alpha(\mathbf{m} \mid \mathbf{E}) = \sum_{i}^{K} \left[ m_{i} \ln E_{i} - E_{i} \right],$$

where $\alpha$ is a function that shares the same maximum position as $P(\mathbf{m} \mid \mathbf{E})$. Now consider that $\mathbf{E}$ comes from a ground truth $\mathbf{x}$ and a measurement $\mathbf{H}$ which is assumed to be linear. Then

$$\mathbf{E} = \mathbf{H} \mathbf{x},$$

where a matrix multiplication is implied. This can also be written in the form

$$E_{m} = \sum_{n}^{K} H_{mn} x_{n},$$

where it can be seen how $H$ mixes or blurs the ground truth.
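As a rough numerical illustration of the quantities defined so far, the sketch below computes $\mathbf{E} = \mathbf{H}\mathbf{x}$ and $\alpha(\mathbf{m} \mid \mathbf{E})$ for a tiny example. The 3 × 3 blur matrix, the candidate images and the function names `expected_counts` and `alpha` are all assumptions chosen for illustration. The measurement is taken to be noise-free ($\mathbf{m}$ equal to the expected counts of the true image) so that the comparison is exact.

```python
import math


def expected_counts(H, x):
    """E_m = sum_n H_mn * x_n (linear measurement of the ground truth)."""
    return [sum(h * x_n for h, x_n in zip(row, x)) for row in H]


def alpha(m, E):
    """Poisson log-likelihood without the constant ln(m_i!) term."""
    return sum(m_i * math.log(E_i) - E_i for m_i, E_i in zip(m, E))


# Hypothetical 3-pixel system: each pixel leaks a quarter of its light
# into each of the other two pixels.
H = [
    [0.50, 0.25, 0.25],
    [0.25, 0.50, 0.25],
    [0.25, 0.25, 0.50],
]
x_true = [1.0, 4.0, 1.0]
x_flat = [2.0, 2.0, 2.0]

m = expected_counts(H, x_true)  # noise-free "measurement" for this sketch

alpha_true = alpha(m, expected_counts(H, x_true))
alpha_flat = alpha(m, expected_counts(H, x_flat))
print(alpha_true > alpha_flat)
```

Each term $m_{i} \ln E_{i} - E_{i}$ is largest when $E_{i} = m_{i}$, so the candidate image whose blurred version reproduces the measurement scores higher than the flat image with the same total intensity.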
It can also be shown that the derivative of an element $E_{i}$ of $\mathbf{E}$ with respect to an element $x_{j}$ of $\mathbf{x}$ can be written as

$$\frac{\partial E_{i}}{\partial x_{j}} = H_{ij}. \qquad (1)$$

It is easy to see this by writing out a matrix $\mathbf{H}$ of, say, 5 × 5 and two arrays $\mathbf{E}$ and $\mathbf{x}$ of 5 elements and checking it. This last equation can be interpreted as how much one element of $\mathbf{x}$, say element $j$, influences the elements of $\mathbf{E}$ with $i \neq j$ (and of course the case $i = j$ is also taken into account). For example, in a typical case an element of the ground truth $\mathbf{x}$ will influence nearby elements in $\mathbf{E}$ but not the very distant ones (a value of $0$ is expected on those matrix elements).

Now, the key and arbitrary step: $\mathbf{x}$ is not known but may be estimated by $\hat{\mathbf{x}}$. Let's call $\hat{\mathbf{x}}_{\text{old}}$ and $\hat{\mathbf{x}}_{\text{new}}$ the estimated ground truths while using the RL algorithm, where the hat symbol is used to distinguish the ground truth from an estimator of the ground truth:

$$\hat{\mathbf{x}}_{\text{new}} = \hat{\mathbf{x}}_{\text{old}} + \lambda \left. \frac{\partial \alpha(\mathbf{m} \mid \mathbf{E}(\mathbf{x}))}{\partial \mathbf{x}} \right|_{\hat{\mathbf{x}}_{\text{old}}},$$

where $\frac{\partial}{\partial \mathbf{x}}$ stands for a $K$-dimensional gradient. Performing the partial derivative of $\alpha(\mathbf{m} \mid \mathbf{E}(\mathbf{x}))$ yields the following expression:

$$\frac{\partial \alpha(\mathbf{m} \mid \mathbf{E}(\mathbf{x}))}{\partial x_{j}} = \frac{\partial}{\partial x_{j}} \sum_{i}^{K} \left[ m_{i} \ln E_{i} - E_{i} \right] = \sum_{i}^{K} \left[ \frac{m_{i}}{E_{i}} \frac{\partial}{\partial x_{j}} E_{i} - \frac{\partial}{\partial x_{j}} E_{i} \right] = \sum_{i}^{K} \frac{\partial E_{i}}{\partial x_{j}} \left[ \frac{m_{i}}{E_{i}} - 1 \right].$$

By substituting (1), it follows that

$$\frac{\partial \alpha(\mathbf{m} \mid \mathbf{E}(\mathbf{x}))}{\partial x_{j}} = \sum_{i}^{K} H_{ij} \left[ \frac{m_{i}}{E_{i}} - 1 \right].$$

Note that $H_{ji}^{T} = H_{ij}$ by the definition of a matrix transpose. And hence

$$\frac{\partial \alpha(\mathbf{m} \mid \mathbf{E}(\mathbf{x}))}{\partial x_{j}} = \sum_{i}^{K} H_{ji}^{T} \left[ \frac{m_{i}}{E_{i}} - 1 \right].$$

Since this equation is true for all $j$ spanning all the elements from $1$ to $K$, these $K$ equations may be compactly rewritten as a single vectorial equation

$$\frac{\partial \alpha(\mathbf{m} \mid \mathbf{E}(\mathbf{x}))}{\partial \mathbf{x}} = \mathbf{H}^{T} \left[ \frac{\mathbf{m}}{\mathbf{E}} - \mathbf{1} \right],$$

where $\mathbf{H}^{T}$ is a matrix, and $\mathbf{m}$, $\mathbf{E}$ and $\mathbf{1}$ are vectors. Now, as a seemingly arbitrary but key step, let

$$\lambda = \frac{\hat{\mathbf{x}}_{\text{old}}}{\mathbf{H}^{T} \mathbf{1}},$$

where $\mathbf{1}$ is a vector of ones of size $K$ (same as $\mathbf{m}$, $\mathbf{E}$ and $\mathbf{x}$), and the d

Social software engineering

Social software engineering (SSE) is a branch of software engineering that is concerned with the social aspects of software development and the developed software. SSE focuses on the socialness of both software engineering and developed software. On the one hand, the consideration of social factors in software engineering activities, processes and CASE tools is deemed to be useful to improve the quality of both the development process and the produced software. Examples include the role of situational awareness and multi-cultural factors in collaborative software development. On the other hand, the dynamicity of the social contexts in which software could operate (e.g., in a cloud environment) calls for engineering social adaptability as a runtime iterative activity. Examples include approaches which enable software to gather users' quality feedback and use it to adapt autonomously or semi-autonomously.

SSE studies and builds socially-oriented tools to support collaboration and knowledge sharing in software engineering. SSE also investigates the adaptability of software to the dynamic social contexts in which it could operate and the involvement of clients and end users in shaping software adaptation decisions at runtime. Social context includes norms, culture, roles and responsibilities, stakeholders' goals and interdependencies, end users' perception of the quality and appropriateness of each software behaviour, etc.
The participants of the 1st International Workshop on Social Software Engineering and Applications (SoSEA 2008) proposed the following characterization:

* Community-centered: Software is produced and consumed by and/or for a community rather than focusing on individuals
* Collaboration/collectiveness: Exploiting the collaborative and collective capacity of human beings
* Companionship/relationship: Making explicit the various associations among people
* Human/social activities: Software is designed consciously to support human activities and to address social problems
* Social inclusion: Software should enable social inclusion, enforcing links and trust in communities

Thus, SSE can be defined as "the application of processes, methods, and tools to enable community-driven creation, management, deployment, and use of software in online environments". One of the main observations in the field of SSE is that the concepts, principles, and technologies made for social software applications are applicable to software development itself, as software engineering is inherently a social activity. SSE is not limited to specific activities of software development. Accordingly, tools have been proposed supporting different parts of SSE, for instance social system design or social requirements engineering. Consequently, vertical market software, such as software development tools, engineering tools, marketing tools or software that helps users in a decision-making process, can profit from social components. Such vertical social software differs strongly in its user base from traditional social software such as Yammer.

Operational system

An operational system is a term used in data warehousing to refer to a system that is used to process the day-to-day transactions of an organization. These systems are designed so that day-to-day transactions are processed efficiently and the integrity of the transactional data is preserved.

== Synonyms ==

Sometimes operational systems are referred to as operational databases, transaction processing systems, or online transaction processing systems (OLTP). However, the use of the last two terms as synonyms may be confusing, because operational systems can be batch processing systems as well. Any enterprise must necessarily maintain a lot of data about its operation.

Microsoft SQL Server Master Data Services

Microsoft SQL Server Master Data Services (MDS) is a Master Data Management (MDM) product from Microsoft that ships as a part of the Microsoft SQL Server relational database management system. Master data management (MDM) allows an organization to discover and define non-transactional lists of data, and compile maintainable, reliable master lists. Master Data Services first shipped with Microsoft SQL Server 2008 R2. Microsoft SQL Server 2016 introduced enhancements to Master Data Services, such as improved performance and security, the ability to clear transaction logs, create custom indexes and share entity data between different models, and support for many-to-many relationships.

== Overview ==

In Master Data Services, the model is the highest level container in the structure of your master data. You create a model to manage groups of similar data. A model contains one or more entities, and entities contain members that are the data records. An entity is similar to a table. Like other MDM products, Master Data Services aims to create a centralized data source and keep it synchronized, and thus reduce redundancies, across the applications which process the data.

Sharing the architectural core with Stratature +EDM, Master Data Services uses a Microsoft SQL Server database as the physical data store. It is a part of the Master Data Hub, which uses the database to store and manage data entities. The hub is a database with the software to validate and manage the data, and keep it synchronized with the systems that use the data. The master data hub has to extract the data from the source system, validate, sanitize and shape the data, remove duplicates, and update the hub repositories, as well as synchronize the external sources. The entity schemas, attributes, data hierarchies, validation rules and access control information are specified as metadata to the Master Data Services runtime. Master Data Services does not impose any limitation on the data model.
Master Data Services also allows custom business rules to be defined for validating and sanitizing the data entering the data hub; these rules are then run against the data matching the specified criteria. All changes made to the data are validated against the rules, and a log of the transaction is stored persistently. Violations are logged separately, and the owner can optionally be notified automatically. All the data entities can be versioned.

Master Data Services allows the master data to be categorized by hierarchical relationships, for example employee data being a subtype of organization data. Hierarchies are generated by relating data attributes. Data can be automatically categorized using rules, and the categories are introspected programmatically.

Master Data Services can also expose the data as Microsoft SQL Server views, which can be pulled by any SQL-compatible client. It uses a role-based access control system to restrict access to the data. The views are generated dynamically, so they contain the latest data entities in the master hub. It can also push out the data by writing to some external journals.

Master Data Services also includes a web-based UI for viewing and managing the data. It uses ASP.NET in the back-end. The Silverlight front-end was replaced with HTML5 in SQL Server 2019. Master Data Services provides a web service interface to expose the data, as well as an API that internally uses the exposed web services and makes the feature set available programmatically for accessing and manipulating the data. It also integrates with Active Directory for authentication purposes. Unlike +EDM, Master Data Services supports Unicode characters as well as multilingual user interfaces. SQL Server 2016 introduced a significant performance increase in Master Data Services over previous versions.

== Terminology ==

* Model is the highest level of an MDS instance. It is the primary container for specific groupings of master data. In many ways it is very similar to the idea of a database.
* Entities are containers created within a model. Entities provide a home for members, and are in many ways analogous to database tables (e.g. Customer).
* Members are analogous to the records in a database table (Entity), e.g. Will Smith. Members are contained within entities. Each member is made up of two or more attributes.
* Attributes are analogous to the columns within a database table (Entity), e.g. Surname. Attributes exist within entities and help describe members (the records within the table). Name and Code attributes are created by default for each entity and serve to describe and uniquely identify leaf members. Attributes can be related to attributes from other entities, in which case they are called 'domain-based' attributes. This is similar to the concept of a foreign key. Other attributes, however, will be of type 'free-form' (most common) or 'file'.
* Attribute Groups are explicitly defined collections of particular attributes. Say you have an entity "customer" that has 50 attributes, which is too much information for many of your users. Attribute groups enable the creation of custom sets of hand-picked attributes that are relevant for specific audiences (e.g. "customer - delivery details", which would include just the customer's name and last known delivery address). This is very similar to a database view.
* Hierarchies organize members into either Derived or Explicit hierarchical structures. Derived hierarchies, as the name suggests, are derived by the MDS engine based on the relationships that exist between attributes. Explicit hierarchies are created by hand using both leaf and consolidated members.
* Business Rules can be created and applied against model data to ensure that custom business logic is adhered to. In order to be committed into the system, data must pass all business rule validations applied to it. For example, within the Customer Entity you may want to create a business rule that ensures all members of the 'Country' Attribute contain either the text "USA" or "Canada". The Business Rule, once created and run, will then verify that all the data is correct before it is accepted into the approved model.
* Versions provide system owners and administrators with the ability to Open, Lock or Commit a particular version of a model and the data contained within it at a particular point in time. As the content within a model varies, grows or shrinks over time, versions provide a way of managing metadata so that subscribing systems can access the correct content.
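The database analogy in the terminology can be pictured with a small Python sketch. This is not the Master Data Services API: the classes `Model` and `Entity`, the helper `validate` and the function `country_rule` are hypothetical names invented for illustration. They only mirror the relationships described above (a model contains entities, entities contain members, members are described by attributes, and business rules gate what is accepted).

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Analogous to a table: attribute names plus member records."""
    name: str
    attributes: list
    members: list = field(default_factory=list)  # each member: attribute -> value


@dataclass
class Model:
    """Top-level container, loosely analogous to a database."""
    name: str
    entities: dict = field(default_factory=dict)


def country_rule(member):
    """Hypothetical business rule: Country must be "USA" or "Canada"."""
    return member.get("Country") in {"USA", "Canada"}


def validate(entity, rules):
    """Split members into those passing every rule and those violating one."""
    passed, violations = [], []
    for member in entity.members:
        ok = all(rule(member) for rule in rules)
        (passed if ok else violations).append(member)
    return passed, violations


customer = Entity(
    name="Customer",
    attributes=["Name", "Code", "Country"],
    members=[
        {"Name": "Will Smith", "Code": "C001", "Country": "USA"},
        {"Name": "Example Person", "Code": "C002", "Country": "France"},
    ],
)
model = Model(name="Sales", entities={"Customer": customer})

accepted, rejected = validate(model.entities["Customer"], [country_rule])
print(len(accepted), len(rejected))
```

In this sketch the second member is reported as a violation rather than dropped, loosely echoing the description of violations being logged separately.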

Controlled vocabulary

A controlled vocabulary provides a way to organize knowledge for subsequent retrieval. Controlled vocabularies are used in subject indexing schemes, subject headings, thesauri, taxonomies and other knowledge organization systems. Controlled vocabulary schemes mandate the use of predefined, preferred terms that have been preselected by the designers of the schemes, in contrast to natural language vocabularies, which have no such restriction.

== In library and information science ==

In library and information science, a controlled vocabulary is a carefully selected list of words and phrases, which are used to tag units of information (document or work) so that they may be more easily retrieved by a search. Controlled vocabularies solve the problems of homographs, synonyms and polysemes by a bijection between concepts and preferred terms. In short, controlled vocabularies reduce unwanted ambiguity inherent in normal human languages, where the same concept can be given different names, and ensure consistency.

For example, in the Library of Congress Subject Headings (a subject heading system that uses a controlled vocabulary), preferred terms (subject headings in this case) have to be chosen to handle choices between variant spellings of the same word (American versus British), choice among scientific and popular terms (cockroach versus Periplaneta americana), and choices between synonyms (automobile versus car), among other difficult issues. Choices of preferred terms are based on the principles of user warrant (what terms users are likely to use), literary warrant (what terms are generally used in the literature and documents), and structural warrant (terms chosen by considering the structure, scope of the controlled vocabulary). Controlled vocabularies also typically handle the problem of homographs with qualifiers.
For example, the term pool has to be qualified to refer to either swimming pool or the game pool to ensure that each preferred term or heading refers to only one concept.

=== Types used in libraries ===

There are two main kinds of controlled vocabulary tools used in libraries: subject headings and thesauri. While the differences between the two are diminishing, there are still some minor differences:

* Historically, subject headings were designed to describe books in library catalogs by catalogers, while thesauri were used by indexers to apply index terms to documents and articles.
* Subject headings tend to be broader in scope, describing whole books, while thesauri tend to be more specialized, covering very specific disciplines.
* Because of the card catalog system, subject headings tend to have terms that are in indirect order (though with the rise of automated systems this is being removed), while thesaurus terms are always in direct order.
* Subject headings tend to use more pre-coordination of terms, such that the designer of the controlled vocabulary will combine various concepts together to form one preferred subject heading (e.g., children and terrorism), while thesauri tend to use singular direct terms.
* Thesauri list not only equivalent terms but also narrower, broader and related terms among various preferred and non-preferred (but potentially synonymous) terms, while historically most subject headings did not.

For example, the Library of Congress Subject Headings themselves did not have much syndetic structure until 1943, and it was not until 1985 that they began to adopt the thesaurus-type terms "Broader term" and "Narrower term". The terms are chosen and organized by trained professionals (including librarians and information scientists) who possess expertise in the subject area. Controlled vocabulary terms can accurately describe what a given document is actually about, even if the terms themselves do not occur within the document's text.
Well known subject heading systems include the Library of Congress system, Medical Subject Headings (MeSH) created by the United States National Library of Medicine, and Sears. Well known thesauri include the Art and Architecture Thesaurus and the ERIC Thesaurus. When selecting terms for a controlled vocabulary, the designer has to consider the specificity of the term chosen, whether to use direct entry, inter consistency and stability of the language. Lastly, the amount of pre-coordination (in which case the degree of enumeration versus synthesis becomes an issue) and post-coordination in the system is another important issue. Controlled vocabulary elements (terms/phrases) employed as tags, to aid in the content identification process of documents or other information system entities (e.g. DBMS, Web Services), qualify as metadata.

== Indexing languages ==

There are three main types of indexing languages.

* Controlled indexing language: only approved terms can be used by the indexer to describe the document
* Natural language indexing language: any term from the document in question can be used to describe the document
* Free indexing language: any term (not only from the document) can be used to describe the document

When indexing a document, the indexer also has to choose the level of indexing exhaustivity, the level of detail in which the document is described. For example, using low indexing exhaustivity, minor aspects of the work will not be described with index terms. In general, the higher the indexing exhaustivity, the more terms are indexed for each document. In recent years free text search as a means of access to documents has become popular. This involves using natural language indexing with indexing exhaustivity set to maximum (every word in the text is indexed). These methods have been compared in some studies, such as the 2007 article, "A Comparative Evaluation of Full-text, Concept-based, and Context-sensitive Search".
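The way a controlled indexing language maps variant words onto preferred terms, and uses qualifiers for homographs, can be sketched with a small lookup table. Everything in this example is an assumption made for illustration: the choice of "automobile" as the preferred term, the qualifier spellings, the document identifiers, and the function names `to_preferred` and `search`.

```python
# Variant or entry term -> preferred term (hypothetical vocabulary).
PREFERRED = {
    "automobile": "automobile",
    "car": "automobile",
    "swimming pool": "pool (swimming)",
    "billiards": "pool (game)",
}


def to_preferred(term):
    """Map an entry term to its preferred term; unapproved terms are rejected."""
    key = term.strip().lower()
    if key not in PREFERRED:
        raise KeyError(f"{term!r} is not an approved or variant term")
    return PREFERRED[key]


def search(index, query):
    """Return the documents tagged with the preferred term for the query."""
    wanted = to_preferred(query)
    return sorted(doc for doc, tags in index.items() if wanted in tags)


# Documents tagged by an indexer using preferred terms only.
index = {
    "doc1": {"automobile"},
    "doc2": {"pool (game)"},
    "doc3": {"pool (swimming)", "automobile"},
}

print(search(index, "car"))
print(search(index, "billiards"))
```

A query for "car" reaches the documents tagged "automobile" without the searcher having to try each synonym, while the bare word "pool" is rejected in this sketch because it does not identify a single concept.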
=== Advantages ===

Controlled vocabularies are often claimed to improve the accuracy of free text searching, for example by reducing irrelevant items in the retrieval list. These irrelevant items (false positives) are often caused by the inherent ambiguity of natural language. Take the English word football for example. Football is the name given to a number of different team sports. Worldwide the most popular of these team sports is association football, which also happens to be called soccer in several countries. The word football is also applied to rugby football (rugby union and rugby league), American football, Australian rules football, Gaelic football, and Canadian football. A search for football therefore will retrieve documents that are about several completely different sports. Controlled vocabulary solves this problem by tagging the documents in such a way that the ambiguities are eliminated.

Compared to free text searching, the use of a controlled vocabulary can dramatically increase the performance of an information retrieval system, if performance is measured by precision (the percentage of documents in the retrieval list that are actually relevant to the search topic). In some cases controlled vocabulary can enhance recall as well, because unlike natural language schemes, once the correct preferred term is searched, there is no need to search for other terms that might be synonyms of that term.

=== Disadvantages ===

A controlled vocabulary search may lead to unsatisfactory recall, in that it will fail to retrieve some documents that are actually relevant to the search question. This is particularly problematic when the search question involves terms that are sufficiently tangential to the subject area such that the indexer might have decided to tag it using a different term (but the searcher might consider the same).
Essentially, this can be avoided only by an experienced user of controlled vocabulary whose understanding of the vocabulary coincides with that of the indexer. Another possibility is that the article is just not tagged by the indexer because indexing exhaustivity is low. For example, an article might mention football as a secondary focus, and the indexer might decide not to tag it with "football" because it is not important enough compared to the main focus. But it turns out that for the searcher that article is relevant, and hence recall fails. A free text search would automatically pick up that article regardless. On the other hand, free text searches have high exhaustivity (every word is searched), so although they have much lower precision, they have the potential for high recall as long as the searcher overcomes the problem of synonyms by entering every combination. Controlled vocabularies may become outdated rapidly in fast developing fields of knowledge, unless the preferred terms are updated regularly. Even in an ideal scenario, a controlled vocabulary is often less specific than the words of the text itself. Indexers trying to choose the appropriate index terms might misinterpret the author, while this precise problem is not a factor in a free text search, as it uses the author's own words. The use of controlled vocabularies can be costly compared to free

Mark V. Shaney

Mark V. Shaney is a synthetic Usenet user whose postings in the net.singles newsgroups were generated by Markov chain techniques, based on text from other postings. The username is a play on the words "Markov chain". Many readers were fooled into thinking that the quirky, sometimes uncannily topical posts were written by a real person. The system was designed by Rob Pike with coding by Bruce Ellis. Don P. Mitchell wrote the Markov chain code, initially demonstrating it to Pike and Ellis using the Tao Te Ching as a basis. They chose to apply it to the net.singles netnews group.

The program is fairly simple. It ingests the sample text (the Tao Te Ching, or the posts of a Usenet group) and creates a massive list of every sequence of three successive words (triplet) which occurs in the text. It then chooses two words at random, and looks for a word which follows those two in one of the triplets in its massive list. If there is more than one, it picks at random (identical triplets count separately, so a sequence which occurs twice is twice as likely to be picked as one which only occurs once). It then adds that word to the generated text. Then, in the same way, it picks a triplet that starts with the second and third words in the generated text, and that gives a fourth word. It adds the fourth word, then repeats with the third and fourth words, and so on. This algorithm is called a third-order Markov chain (because it uses sequences of three words), although each new word depends only on the two preceding words, which in the more common convention makes it a second-order chain.

== Examples ==

A classic example, from 1984, originally sent as a mail message, later posted to net.singles is reproduced here:

>From mvs Fri Nov 16 17:11 EST 1984 remote from alice

It looks like Reagan is going to say? Ummm... Oh yes, I was looking for. I'm so glad I remembered it. Yeah, what I have wondered if I had committed a crime. Don't eat with your assessment of Reagon and Mondale. Up your nose with a guy from a firm that specifically researches the teen-age market.
As a friend of mine would say, "It really doesn't matter"... It looks like Reagan is holding back the arms of the American eating public have changed dramatically, and it got pretty boring after about 300 games. People, having a much larger number of varieties, and are very different from what one can find in Chinatowns across the country (things like pork buns, steamed dumplings, etc.) They can be cheap, being sold for around 30 to 75 cents apiece (depending on size), are generally not greasy, can be adequately explained by stupidity. Singles have felt insecure since we came down from the Conservative world at large. But Chuqui is the way it happened and the prices are VERY reasonable. Can anyone think of myself as a third sex. Yes, I am expected to have. People often get used to me knowing these things and then a cover is placed over all of them. Along the side of the $$ are spent by (or at least for ) the girls. You can't settle the issue. It seems I've forgotten what it is, but I don't. I know about violence against women, and I really doubt they will ever join together into a large number of jokes. It showed Adam, just after being created. He has a modem and an autodial routine. He calls my number 1440 times a day. So I will conclude by saying that I can well understand that she might soon have the time, it makes sense, again, to get the gist of my argument, I was in that (though it's a Republican administration). _-_-_-_-Mark Other quotations from Mark's Usenet posts are: "I spent an interesting evening recently with a grain of salt." (Alternatively reported as "While at a conference a few weeks back, I spent an interesting evening with a grain of salt.") "I hope that there are sour apples in every bushel." 
(see also sour grapes) == History == In The Usenet Handbook Mark Harrison writes that after September 1981, students joined Usenet en masse, "creating the USENET we know today: endless dumb questions, endless idiots posing as savants, and (of course) endless victims for practical jokes." In December, Rob Pike created the netnews group net.suicide as a prank, "a forum for bad jokes". Some users thought it was a legitimate forum, and some discussed "riding motorcycles without helmets". At first, most posters were "real people", but soon "characters" began posting. Pike created a "vicious" character named Bimmler. At its peak, net.suicide had ten frequent posters; nine were "known to be characters." But ultimately, Pike deleted the newsgroup because it was too much work to maintain; Bimmler messages were created "by hand". The "obvious alternative" was software created by Bruce Ellis and running on a Bell Labs computer, based on the Markov code by Don Mitchell, which became the online character Mark V. Shaney. Kernighan and Pike listed Mark V. Shaney in the acknowledgements in The Practice of Programming, noting its roots in Mitchell's markov, which, adapted as shaney, was used for "humorous deconstructionist activities" in the 1980s. Dewdney pointed out "perhaps Mark V. Shaney's magnum opus: a 20-page commentary on the deconstructionist philosophy of Jean Baudrillard" directed by Pike, with assistance from Henry S. Baird and Catherine Richards, to be distributed by email. The piece was based on Jean Baudrillard's "The Precession of Simulacra", published in Simulacra and Simulation (1981). == Reception == The program was discussed by A. K. 
Dewdney in the Scientific American "Computer Recreations" column in 1989, by Penn Jillette in his PC Computing column in 1991, and in several books, including the Usenet Handbook, Bots: the Origin of New Species, Hippo Eats Dwarf: A Field Guide to Hoaxes and Other B.S., and non-computer-related journals such as Texas Studies in Literature and Language. Dewdney wrote about the program's output, "The overall impression is not unlike what remains in the brain of an inattentive student after a late-night study session. Indeed, after reading the output of Mark V. Shaney, I find ordinary writing almost equally strange and incomprehensible!" He noted the reactions of newsgroup users, who have "shuddered at Mark V. Shaney's reflections, some with rage and others with laughter:" The opinions of the new net.singles correspondent drew mixed reviews. Serious users of the bulletin board's services sensed satire. Outraged, they urged that someone "pull the plug" on Mark V. Shaney's monstrous rantings. Others inquired almost admiringly whether the program was a secret artificial intelligence project that was being tested in a human conversational environment. A few may even have thought that Mark V. Shaney was a real person, a tortured schizophrenic desperately seeking a like-minded companion. Concluding, Dewdney wrote, "If the purpose of computer prose is to fool people into thinking that it was written by a sane person, Mark V. Shaney probably falls short." A 2012 article in Observer compared Mark V. Shaney's "strangely beautiful" postings to the Horse_ebooks account on Twitter and music reviews at Pitchfork, saying that "this mash-up of gibberish and human sentiment" is what "made Mark V. Shaney so endlessly fascinating".
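The triplet-based generation procedure described in the program overview above can be sketched in a few lines of Python. This is a minimal illustration written for this article, not the original program by Mitchell and Ellis: the function names `build_table` and `generate`, the whitespace tokenisation, and the choice to stop when no follower exists are all assumptions.

```python
import random
from collections import defaultdict


def build_table(text):
    """Map each pair of successive words to the words that follow it."""
    words = text.split()  # assumption: split on whitespace only
    table = defaultdict(list)
    for first, second, third in zip(words, words[1:], words[2:]):
        # Duplicates are kept, so a triplet that occurs twice in the
        # sample is twice as likely to be chosen as one that occurs once.
        table[(first, second)].append(third)
    return table


def generate(table, length, rng=None):
    """Generate up to `length` words (at least two) from the table."""
    rng = rng or random.Random()
    # Start from a randomly chosen word pair that occurs in the sample.
    first, second = rng.choice(sorted(table))
    output = [first, second]
    while len(output) < length:
        # Look up the words seen after the last two generated words.
        followers = table.get((output[-2], output[-1]))
        if not followers:
            break  # this pair was only seen at the very end of the sample
        output.append(rng.choice(followers))
    return " ".join(output)
```

Calling `generate(build_table(sample_text), 50)` on a reasonably long sample produces text in which every run of three words also occurs somewhere in the sample, which is what gives the output its locally plausible but globally incoherent character.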

Enterprise data planning

Enterprise data planning is the starting point for enterprise-wide change. It states the destination and describes how you will get there. It defines benefits, costs and potential risks. It provides measures to be used along the way to judge progress and adjust the journey according to changing circumstances. Data is fundamental to investment enterprises. Effective, economic management of data underpins operations and enables the transformations needed to satisfy customer demands, competition and regulation. Data warehouses and other aspects of the overall data architecture are critical to the enterprise. EDMworks has created a strategic data planning approach for the investment sector. It consists of a planning process, planning intranets, templates and training materials. EDMworks' planning process is based on the belief that extensive domain knowledge significantly shortens planning iterations and enables progressively higher quality plans to be produced and implemented. This approach drives the development of an effective and economic enterprise data architecture. Enterprise data planning is based on proven business disciplines. Key architectural layers for data and applications are then added in order to provide an enterprise-wide understanding of the uses and interdependencies of data. This enables the definition of the core components of the enterprise data management (EDM) plan: industry structure and business objectives; assessment of systems and services; target architecture for applications, data and infrastructure; target organization structures; systems, database, infrastructure and organizational plans; and business case, costs, benefits, results and risks. EDMworks uses several components from The Open Group's TOGAF enterprise systems planning process. TOGAF acts as an extension to good business planning methods to provide a framework for the development of the systems and data architectural components. 
== History == James Martin was one of the pathfinders in data planning methodologies. He was one of the first to identify data as an enterprise-wide asset that required management. He developed a series of tools and methods to support that process. Most of the large consulting firms developed their own methods to address the same basic issue. Frequently, their approaches were incorporated into their own branded system development methodologies that encompassed the complete systems development life-cycle. Others, such as Ed Tozer, developed more focused offerings that dealt with the complexities of extracting key business needs from senior management and then defining relevant architectural visions for the specific enterprise. From these various sources, the concepts of Business, Data, Applications and Technology Architectures emerged. The Open Group Architecture Framework (TOGAF) has taken this work forward and has established a sound method in TOGAF version 9. EDMworks' approach is to adopt these planning and architectural practices as a basis and then add two dimensions to the planning and implementation focus. The first is domain knowledge of the investments sector: investments is a complex global industry with a common set of characteristics about clients, information vendors, competition and regulation, and domain knowledge significantly improves the quality of the planning and implementation processes. The second is the development of people and teams: change is a major feature of any Enterprise Data Management program, and both people and teams need development in order to make EDM effective throughout an organization.