<p>Hideous Humpback Freak: Random Musing of an Eccentric Software Aficionado (feed: https://www.hideoushumpbackfreak.com/feed.xml, updated 2019-12-03)</p>
<h1>Psychological Effects of Software Engineering Tenets</h1>
<p>Dale Alleshouse (dale@alleshouse.net), 2019-05-14, https://www.hideoushumpbackfreak.com/2019/05/14/Psychological-Effects-of-Software-Engineering-Tenets</p>
<p>Organizations often attempt to define software engineering tenets (aka core
values or principles) that serve as a static mental and behavioral model to
guide employees. Tenets are a powerful tool capable of inspiring people toward a
common goal. However, poorly designed tenets can just as easily unite a
community on a myopic quest toward inefficiency.</p>
<p>Tenets are essentially a simplified model of a problem space that reduces the
cognitive load associated with decision making. The intent is to provide a
shorter path to a good decision. These models often become a deeply ingrained
component of engineers’ psyches which are subject to the human proclivity of
self-defense. Unfortunately, models that are overly-simplistic or inaccurate
provide a shorter path to a bad decision that may be defended with disregard
toward reason. This is why it’s important to avoid tenets that promote silver
bullet thinking or preclude reasoning from the engineering process.</p>
<!--more-->
<h2 id="humans-affinity-toward-their-world-view">Humans’ Affinity Toward Their World View</h2>
<p>Humans have a deeply ingrained psychological need to categorize and simplify the
world around them. Notice the deliberate use of the word need; it’s much more
than a mere idiosyncratic proclivity. Reducing complex phenomena to
comprehensible models that can be conjured up by a single word or phrase makes
the world tractable. Conflating new experiences with known phenomena provides
the ability to quickly predict the behavior of the environment and the objects
therein. The inevitable consequence is that models become part of the psyche: a
deeply ingrained component of one’s world view. The whole of cognition rests
upon this ability. This is the thesis of Douglas Hofstadter’s seminal paper
<em>Analogy as the Core of Cognition</em>. It would be easy to make a case that
many of humanity’s prodigious accomplishments are a consequence of this need.</p>
<p>Unfortunately, the same case could be made for many of humanity’s most heinous
atrocities. Often, the simplified models humans create to predict the behavior
of their environment are incomplete or inaccurate. These models are often
expanded well beyond what is appropriate (see the concept of illusory
correlation as defined by Stanovich). To make matters worse, humans pugnaciously
defend their world view as a self-defense mechanism (refer to Jonas Kaplan’s
body of work for a more comprehensive treatment of the theory). Self-defense
operates at a baser level with indifference toward reason. This is essentially
what Max Planck was referring to when he said, “An important scientific
innovation rarely makes its way by gradually winning over and converting its
opponents: What does happen is that the opponents gradually die out”. The
Oatmeal illustrates this concept beautifully with his <em>You’re not going to
believe what I’m about to tell you</em> comic (<a href="https://theoatmeal.com/comics/believe">The
Oatmeal</a>).</p>
<p>Humans’ ability to overcome this self-defense mechanism is a debate that reaches
back to the 1700s with Kant’s subject-object distinction. The argument comprises
the very essence of the modern/post-modern philosophical dichotomy. The
conclusions that have evolved out of these two lines of thought are truly
astounding. Drawing any sort of conclusions about objective reality is well
beyond the scope of this humble article. That is best left to the purview of
philosophers. However, for the purpose at hand, there are a few salient points
that can be accepted as fact:</p>
<ul>
<li>Humans need models to make the world tractable</li>
<li>Models can be incomplete or inaccurate</li>
<li>Models become part of humans’ world view</li>
<li>Humans default to a bellicose attitude when protecting their world view</li>
</ul>
<p>To relate these concepts to defining software design tenets, consider the
interactions between software professionals and projects as a microcosm of the
interactions that humans have with the world.</p>
<h2 id="the-software-engineering-microcosm">The Software Engineering Microcosm</h2>
<p>Modern software professionals find themselves submerged in ambiguity. An
epiphenomenon of the industry’s relative adolescence is that the marketplace of
ideas hasn’t had enough time to standardize on best practices. There are
seemingly countless competing concepts battling it out as you read this. Unless
you can predict the future, there is no way to tell which ones will emerge
victorious. Furthermore, the industry is severely lacking in meaningful
empirical studies, so it’s hard to even define what “victory” means. Couple all
of this with ever-shrinking timelines, ever-evolving external security threats,
ever-expanding feature requirements, and relentless pressure to generate revenue,
and things quickly appear unmanageable to even the ablest of people. The plight is
analogous to the struggle of an underprivileged juvenile trying to build
meaningful models of the world without proper guidance. Surely not an impossible
task, but success (depending on how you define it) is statistically improbable.</p>
<p>Make no mistake about it, the software industry needs simplified models in order
to impose order on the chaos. The problem is, the industry has evolved a
particularly pernicious tendency toward overly simplistic models. Typically, the
models look something like this: Company X is an industry leader, they use
Technology/Pattern Y, therefore Pattern Y is the solution to software
productivity. Unfortunately, software professionals are still questing for the
proverbial “silver bullet”, whose very existence Fred Brooks denies.</p>
<p>Examples of fashionable mental models following the pattern above are virtually
endless. Those who were programming in the early nineties (as was this author)
undoubtedly remember the promised panacea of CASE tools. Next came n-tier
architecture with its unfettered flexibility. Many espoused Ruby and other
dynamically typed languages as the one true way because they lowered the
barriers to entry. Conversely, proponents of languages such as Haskell
proselytized expressive type systems as the end of the software crisis. There
is undoubtedly much missing from the list. The latest silver bullet fervor is
channeled toward micro-services and twelve-factor apps.</p>
<p>Without a doubt, there is someone reading this who is feeling dissonance because
their world view has just been challenged. The author sympathizes with said
reader because he has been in his/her shoes. As a young engineer, the author was
challenged by a senior engineer on the efficacy of n-tier architecture. The
immediate response was, “You are simply afraid of change and new paradigms. This
is the RIGHT way to build software”. It’s amazing how time and experience
change one’s perspective.</p>
<p>To summarize the key points:</p>
<ul>
<li>There is much ambiguity around best practices in modern software engineering</li>
<li>Software professionals often adopt mental models that equate to silver bullet
thinking; and this is irresponsible</li>
<li>When mental models become part of software professionals’ world view, they
become defensive of them; this reflects a personal bias that must be
acknowledged and questioned if one is to ever grow, because nothing is
static</li>
</ul>
<p><em>Don’t make the mistake of interpreting this piece as an attack on any
particular architectural pattern</em>. Every model mentioned has undeniable merit.
The underlying fallacy stems from silver bullet thinking that espouses the
<em>RIGHT</em> (and suggestively ONLY) way to build software.</p>
<h2 id="the-right-way-to-build-software">The <em>RIGHT</em> Way to Build Software</h2>
<p>A reasonable person could challenge the thesis of this article by enumerating
the benefits of their preferred model and asserting that it may not be a silver
bullet, but it is certainly a superior approach - so it is, therefore, the right
way to build software. The glaring fallacy in that reasoning is that it attaches
value propositions outside of a context. This is what Micheal Glinter would
classify as Level One Thinking.</p>
<p>Robert Glass has written extensively on this topic in several books and
articles. In his book <em>Software Creativity 2.0</em> he states the following
concerning silver bullet thinking: “we have those who proclaim each new idea
that comes along as the solution to software productivity. … These people, I
would assert, are the level one thinkers. Some perceive them as strong because
they see a solution clearly and move swiftly toward it. Others see them as
simplistic, for they ignore the complexity in the problem and seem unable to
accept the ambiguity.”</p>
<p>Unfortunately, the second law of thermodynamics applies to software: there is no
such thing as a free lunch. Architectural patterns are nothing more than a
series of trade-offs. Two pertinent points follow from this assertion. The first
is that any person who only enumerates advantages without also explaining the
cost of said benefits, <em>simply DOES NOT have a deep understanding of the
architectural pattern</em> or is selling something. The second is that <em>it is
nonsensical to weigh trade-offs outside of a context</em>. Each project has unique
requirements that give meaning to trade-offs. No trade-off is always best, but
it may be best for the use case in question.</p>
<p>Evaluating trade-offs outside of a context is not only nonsensical, it’s also
destructive and the pernicious effects are virtually never-ending. It diverts
the focus from the software’s true purpose of providing value to end users to
arbitrary patterns. Concisely stated, it <em>prioritizes the process above the
product</em>. This equates to wasted efforts and resources. Furthermore, once
engineers adopt an overly-simplified mental model, they are psychologically
incentivized to pursue and perpetuate it.</p>
<p>To summarize:</p>
<ul>
<li>There is no “right” way to build software</li>
<li>Every approach is a series of contextual trade-offs</li>
<li>It’s nonsensical to evaluate trade-offs outside of a context</li>
<li>Evaluating trade-offs outside a context prioritizes the process over the
product</li>
</ul>
<p>In the event that an organization must communicate an affinity toward one side
of a trade-off, that preference should be expressed as a non-functional
requirement and not as a tenet.</p>
<h2 id="non-functional-requirements-are-not-tenets">Non-Functional Requirements are <em>NOT</em> Tenets</h2>
<p>It makes sense that some organizations would want to communicate an affinity
toward a side of a trade-off. For instance, financial institutions may naturally
prioritize security over availability. Specifying these preferences is
particularly important because they rarely surface on their own during iterative
development. However, it makes more sense to express these as non-functional
requirements instead of tenets. This may seem like nothing more than linguistic
gymnastics; however, the psychological effect is very real. Consider the meaning
of the two words:</p>
<dl>
<dt>Tenet</dt>
<dd>
<p>a principle, belief, or doctrine generally held to be true (Merriam-Webster)</p>
</dd>
<dt>Non-Functional Requirement</dt>
<dd>
<p>a requirement that specifies criteria that can be used to judge the operation
of a system, rather than specific behaviors (Wikipedia)</p>
</dd>
</dl>
<p>Specifying a trade-off as a tenet removes reason from the engineering process
and has the unintended consequence of creating an overly-simplistic model of the
software engineering world. Conversely, specifying a non-functional requirement
biases the engineer toward an engineering mindset.</p>
<p>There is one key takeaway here:</p>
<ul>
<li>Specifying a non-functional requirement as a tenet has the unintended
psychological effect of removing reason from the engineering process.</li>
</ul>
<p>Hopefully, this section made the case that it’s undesirable to specify
non-functional requirements as tenets and the previous section established that
silver bullet thinking is destructive. The question remains, what exactly is a
good software engineering tenet?</p>
<h2 id="anatomy-of-healthy-tenets">Anatomy of Healthy Tenets</h2>
<p>A healthy tenet is ambiguous enough to facilitate the engineering process yet
precise enough to focus effort. One of the best examples of a healthy tenet is
Amazon’s Customer Obsession principle:</p>
<blockquote>
<p>Leaders start with the customer and work backwards. They work vigorously to
earn and keep customer trust. Although leaders pay attention to competitors,
they obsess over customers.</p>
</blockquote>
<p>The principle does not bias engineers toward any particular technology or
pattern. Likewise, it doesn’t create a cognitive bias toward an approach.
However, it does bias thinking toward the deep compilation of use cases. In
short, it keeps the focus on the product, not the process.</p>
<p>Another perfect example of a good tenet is Amazon’s Invent and Simplify
principle:</p>
<blockquote>
<p>Leaders expect and require innovation and invention from their teams and
always find ways to simplify. They are externally aware, look for new ideas
from everywhere, and are not limited by “not invented here”. Because we do new
things, we accept that we may be misunderstood for long periods of time.</p>
</blockquote>
<p>The attribute that makes this tenet great is that it encourages the relentless
pursuit of the perfect balance of trade-offs. It creates a bias toward
simplicity. It does not remove reasoning from the process; rather it forces
proponents to think critically. These are the kind of tenets that create
harmony.</p>
<p>As a small aside, make sure to avoid the trap of copying the tenets of industry
leaders. The big technology companies didn’t get where they are by forcing
solutions into their contexts. They aggressively engineered solutions to meet
their contextual needs. It only makes sense to be aware of the practices of
successful enterprises; however, accept that they operate in a different context
from yours. Create solutions for YOUR context.</p>
<p>To summarize, good tenets:</p>
<ul>
<li>Do not blindly prescribe any particular technology, approach, or pattern</li>
<li>Require proponents to think critically at all times; this includes questioning
oneself to guard against one’s own bias</li>
<li>Create a bias toward simplicity but not over-simplicity</li>
<li>Focus on products, not process</li>
</ul>
<h2 id="wrapping-it-up">Wrapping it Up</h2>
<p>Human cognition is a fascinating subject with profound implications for the
engineering process. Regardless of an individual’s reasoning ability, they are
limited by their cognitive architecture and subject to the perils of defending
their contextual world view. Accepting software engineering tenets contributes
to an engineer’s world view. This is why it is important to create tenets
without unintended negative biases (to the extent that’s possible) and to
revisit and revalidate them over time.</p>
<p>Tenets that perpetuate “silver bullet” thinking or establish non-functional
requirements are particularly noxious. They tend to exclude critical thinking
from the engineering process. Additionally, they place emphasis on the process
rather than the product. Conversely, healthy tenets refrain from prescribing any
particular technology, approach, or pattern. They force proponents to think
critically and choose the perfect balance of trade-offs. With a bias toward
simplicity, the focus remains firmly on the product.</p>
<h1>Zero to DevOps in Under an Hour with Kubernetes (CodeMash Conference Talk)</h1>
<p>Dale Alleshouse (dale@alleshouse.net), 2018-07-08, https://www.hideoushumpbackfreak.com/2018/07/08/Zero-to-DevOps</p>
<p>Below is a video of a talk I gave at CodeMash in February 2018.</p>
<div><div class="extensions extensions--video">
<iframe src="https://www.youtube.com/embed/IW4qmmEWYhY?rel=0&showinfo=0" frameborder="0" scrolling="no" allowfullscreen=""></iframe>
</div></div>
<p>Slides: <a href="http://slides.com/dalealleshouse/kube-pi#/">http://slides.com/dalealleshouse/kube-pi#/</a></p>
<p>Code: <a href="https://github.com/dalealleshouse/zero-to-devops/tree/pi">https://github.com/dalealleshouse/zero-to-devops/tree/pi</a></p>
<h1>Gödel, Artificial Intelligence, and Confusion</h1>
<p>Dale Alleshouse (dale@alleshouse.net), 2017-11-21, https://www.hideoushumpbackfreak.com/2017/11/21/Godel-AI-Confusion</p>
<p>Sentient software is the hot topic as of late. Speculative news about Artificial
Intelligence (AI) systems such as Watson, Alexa, and even autonomous vehicles
is dominating social media. It’s feasible that this impression is nothing more
than the Baader-Meinhof phenomenon (AKA frequency illusion). However, it seems that
the populace has a genuine interest in AI. Questions abound. Are there limits? Is
it possible to create a factitious soul? Gödel’s incompleteness theorem is at
the core of these questions; however, the conclusions are cryptic and often
misunderstood.</p>
<p>Gödel’s incompleteness theorem is frequently adduced as proof of antithetical
concepts. For instance, Roger Penrose’s book Shadows of the Mind claims that the
theorem disproves the possibility of sentient machines (Penrose, 1994, p. 65).
Douglas Hofstadter asserts the opposite in his book, I Am a Strange Loop
(Hofstadter, 2007). This article aims to provide a cursory view of the theorem
in layman’s terms and elucidate its practical implications for AI.</p>
<!--more-->
<h2 id="context">Context</h2>
<p>Gödel’s Incompleteness Theorem is best understood within its historical context.
This section covers requisite concepts and notable events to provide the reader
with adequate background knowledge. This is not meant to be comprehensive
coverage of the material; rather, it is stripped down to essentials.</p>
<h3 id="the-challenge">The Challenge</h3>
<p>The mathematics community was never filled with more hope than at the turn of
the twentieth century. On August 8th, 1900, David Hilbert gave his seminal
address at the Second International Congress of Mathematicians in which he
declared, “in mathematics there is no ignorabimus” (Petzold, 2008, p. 40).
Ignorabimus is a Latin word meaning “we shall not know”. Hilbert believed that,
unlike some other branches of science, all things mathematical were knowable.
Furthermore, he framed a plan to actualize a mathematical panacea.</p>
<p>In this address, Hilbert outlined ten open problems and challenged the
mathematics community to solve them (this was a subset of twenty-three problems
published by Hilbert). The problem of relevance for this article is the second
which is entitled, The Compatibility of the Arithmetical Axioms. Hilbert’s second
problem called for the axiomatization of real numbers “to prove that they are
not contradictory, that is, that a finite number of logical steps based upon them
can never lead to contradictory results” (Petzold, 2008, p. 41). More concisely,
Hilbert wished to axiomatize number theory.</p>
<p>The following sections delve into axiomatization. However, a pertinent idea here
is the phrase “finite number of logical steps”. In modern nomenclature, such a
procedure is known as an algorithm. Hilbert, along with his contemporaries,
believed that every mathematical problem was solvable via an algorithmic process
(Petzold, 2008). This is a key concept that will be revisited after exploring
axiomatization.</p>
<h3 id="axiomatization">Axiomatization</h3>
<p>Stated concisely, axiomatization is a means of deriving a system’s theorems by
logical inferences based on a set of axioms. Axioms are unprovable rules that
are self-evidently true. The most well-known axiomatized system is Euclidean
geometry; therefore, it serves as an archetype for understanding axiomatic
systems. The whole of Euclidean geometry is based on five axioms.</p>
<ol>
<li>A straight-line segment can be drawn joining any two points.</li>
<li>Any straight-line segment can be extended indefinitely in a straight line.</li>
<li>Given any straight-line segment, a circle can be drawn having the segment as
radius and one endpoint as center.</li>
<li>All right angles are congruent.</li>
<li>If two lines are drawn which intersect a third in such a way that the sum of
the inner angles on one side is less than two right angles, then the two
lines inevitably must intersect each other on that side if extended far
enough.<br />
(Wolfram Research, Inc., 2017)</li>
</ol>
<p>As a small aside, the fifth axiom is also known as the parallel postulate. This
has been the subject of mathematical debate for centuries. It is highly
recommended that the enthusiastic reader perform additional research on the
subject.</p>
<p>These five axioms form the foundation of geometry. The Pythagorean theorem, Pons
Asinorum, congruence of triangles, Thales’ theorem, and countless others are
derived via logical inferences based on the assumption that these
self-evidentiary axioms are true. Axioms provide a solid foundation for a
system, much like the cornerstone of a building.</p>
<p>Another key concept introduced in the previous paragraph is logical inferences.
It’s not enough to have a firm foundation of axioms. Theorems derived from the
axioms must be likewise sound and logical inference offers a guarantee of said
soundness.</p>
<h3 id="logical-inference">Logical Inference</h3>
<p>The process of connecting axioms to theorems cannot rely on intuition in any
way. This is to say that there are definitive rules and constructs by which
logical inference can be validated. This is important because the legitimacy of
axioms is irrelevant if conclusions drawn from them are not completely
consistent. A strong, stable, and trusted system must be composed of theorems
that use valid logical inferences stemming from axioms.</p>
<p>It is beyond the scope of this blog post to give even a cursory explanation of
logical systems of inference. However, it’s important for the reader to
understand that formal logic has stringent rules and notations much like any
mathematical system. Logic statements are written and manipulated like any other
mathematical formulas. This allows for the creation of proofs that cement the
validity from the bottom up.</p>
<p>Each theorem is analogous to a brick in a house. Because the theorem sits firmly
on either an axiom or another theorem planted on an axiom, its validity is
confirmed. Because every chain of justification terminates at an axiom, the
system avoids an infinite regress. All the theorems taken
together form a strong and stable system capable of being trusted. Formalism
expands on the concept.</p>
<h3 id="formalism">Formalism</h3>
<p>Recall the Compatibility of the Arithmetical Axioms problem outlined in The
Challenge section. Hilbert envisioned Formalism as the solution to this problem.
Formalism, as conceived by Hilbert, is a “system comprised of definitions,
axioms, and rules for constructing theorems from the axioms” (Petzold, 2008, p.
45). It is often described as a sort of metamathematics. Hilbert envisioned a
formal logic language where axioms are represented as strings and theorems are
derived by an algorithmic process. These concepts were introduced in the
previous two sections. A new concept in this section is the set of qualities that
such a system must possess.</p>
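<p>To make “axioms as strings, theorems by an algorithmic process” concrete, here
is a minimal sketch of a toy formal system. It is not Hilbert’s system and does
not appear in the sources cited here; it borrows the MIU puzzle from Hofstadter’s
<em>Gödel, Escher, Bach</em> purely as an illustration. The function names
<code>successors</code> and <code>theorems</code> are hypothetical.</p>

```python
def successors(s):
    """Return every string reachable from s by one MIU inference rule."""
    out = set()
    if s.endswith("I"):            # Rule 1: a string ending in I may gain a U
        out.add(s + "U")
    if s.startswith("M"):          # Rule 2: Mx becomes Mxx
        out.add(s + s[1:])
    for i in range(len(s) - 2):    # Rule 3: any III may be replaced by U
        if s[i:i + 3] == "III":
            out.add(s[:i] + "U" + s[i + 3:])
    for i in range(len(s) - 1):    # Rule 4: any UU may be dropped
        if s[i:i + 2] == "UU":
            out.add(s[:i] + s[i + 2:])
    return out


def theorems(axiom="MI", steps=3):
    """Mechanically derive all theorems reachable within `steps` rule applications."""
    known = {axiom}
    frontier = {axiom}
    for _ in range(steps):
        frontier = {t for s in frontier for t in successors(s)} - known
        known |= frontier
    return known


if __name__ == "__main__":
    print(sorted(theorems(steps=2)))
```

<p>No intuition is involved at any step: the program only rewrites strings. Whether
a particular string (famously, <code>MU</code>) is a theorem at all is a separate
question that cannot be settled by enumeration alone.</p>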
<p>For a system, such as formalism, to truly axiomatize the whole of arithmetic, it
must have four qualities which are outlined below.</p>
<ul>
<li>Independence – There are no superfluous axioms.</li>
<li>Decidability – An algorithmic process exists for determining the validity of formulas.</li>
<li>Consistency – It is NOT possible to derive two theorems that contradict one
another.</li>
<li>Completeness – Ability to derive ALL true formulas from the axioms.<br />
(Petzold, 2008, p. 46)</li>
</ul>
<p>As a small aside, there is a fair bit of legerdemain happening here. The
concepts of truth, formulas, theorems, and proof are purposely glossed over to
avoid minutiae. Curious readers are encouraged to investigate further.</p>
<p>The two qualities that are particularly cogent to Gödel’s incompleteness theorem
are consistency and completeness. Luckily, they are both self-explanatory. A
system that is both complete and consistent will yield all possible true
formulas, none of which are contradictory.</p>
<h3 id="why">Why?</h3>
<p>The truth is that axiomatization is a fastidious process that can seem
maddeningly pedantic. One may be forced to question the very premise that it is
a good thing. One can further postulate that simple human intuition is
sufficient. However, recall the concept of infinite regress called out in the
last paragraph of the Logical Inference section. New theorems are built upon
existing theorems. Without stringent formal logic rules, systems become a
“house of cards”. Mistakes found in foundational theorems can bring the entire
system crashing down.</p>
<p>An archetypal example is Cantor’s set theory. The details of the theory are
largely irrelevant to this line of inquiry, but the curious reader should refer
to this set of blog posts for more information. In short, set theory took the
mathematical world by storm. Countless mathematicians augmented it by building
new abstractions on top of it. Bertrand Russell discovered a fatal flaw known as
Russell’s Paradox which brought the system down like a proverbial “house of
cards”. Formalism is meant to avoid similar debacles.</p>
<h3 id="principia-mathematica">Principia Mathematica</h3>
<p>The Principia Mathematica is an infamous three-volume treatise by Alfred North
Whitehead and Bertrand Russell published in 1910, 1912, and 1913. It is a truly
herculean attempt to formalize the whole of arithmetic. The work is dense and
inaccessible to even most mathematicians (Nagel & Newman, 2001). The system set
forth sets the stage for Gödel’s incompleteness theorem.</p>
<h2 id="incompleteness-theorem">Incompleteness Theorem</h2>
<p>In 1931, Kurt Gödel published a seminal, albeit recondite, paper entitled On
Formally Undecidable Propositions of Principia Mathematica and Related Systems.
The paper dismayed the whole of the mathematical community despite its esoteric
content. It not only trampled the validity of Principia Mathematica, it proved
that such a system isn’t achievable by any means. The implication being that
Hilbert’s second problem, The Compatibility of the Arithmetical Axioms, will never
have a satisfactory solution.</p>
<p>In short, Gödel proved that any system complex enough to encompass simple
arithmetic cannot be both complete and consistent as defined in the Formalism
section. Through a clever method of converting logical expressions to numbers,
the proof showed that any such system will enable the creation of a
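self-referential statement in the form of “this statement cannot be proven”. If
the system is consistent, that statement is true, yet it has no proof within the
system.</p>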
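<p>The “clever method of converting logical expressions to numbers” is known as
Gödel numbering. The sketch below shows the general idea under simplifying
assumptions: the symbol table <code>SYMBOLS</code> and its codes are hypothetical
(Gödel’s own assignment differs), and only single formulas are encoded, not whole
proofs. Each symbol’s code becomes the exponent of the next prime, so unique
factorization guarantees the formula can be recovered from the number.</p>

```python
# Hypothetical symbol codes, chosen only for illustration.
SYMBOLS = {"0": 1, "s": 2, "=": 3, "+": 4, "(": 5, ")": 6}
CODES = {code: symbol for symbol, code in SYMBOLS.items()}


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for short formulas)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1


def encode(formula):
    """Map a formula to one integer: the i-th prime raised to the i-th symbol code."""
    number = 1
    for p, symbol in zip(primes(), formula):
        number *= p ** SYMBOLS[symbol]
    return number


def decode(number):
    """Recover the formula by reading off the exponent of each successive prime."""
    symbols = []
    for p in primes():
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        symbols.append(CODES[exponent])
    return "".join(symbols)


if __name__ == "__main__":
    n = encode("0=0")      # 2**1 * 3**3 * 5**1
    print(n, decode(n))
```

<p>Once formulas are numbers, statements about formulas become statements about
numbers, which is what lets a system of arithmetic talk about itself.</p>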
<p>The previous paragraph is a blatant over-simplification of Gödel’s
incompleteness theorem. The intimate details of the proof are well beyond the
scope of this humble article. As mentioned so many times throughout this work,
the reader is encouraged to continue research independently. On a positive note,
the arcane details are not requisite for comprehension of the implications.</p>
<h2 id="implications">Implications</h2>
<p>In short, the implications of Gödel’s Incompleteness Theorem are nothing more
than that an axiomatic system of logic capable of expressing arithmetic cannot be
both complete and consistent. Expanding on that, it is not possible to devise an
algorithm that will prove every true formula of such a formalized system. One can
then infer that it is not possible to write a computer program that does so.</p>
<p>There have been countless extrapolations based on the implications stated above.
For instance, a commonly adduced argument is that there are more truths in the
universe than there are proofs. Likewise, there are some things that are
obviously true that cannot be formally proven. While these are both true, be
careful not to fall into the enticing trap of applying the rule to anything
outside of axiomatic systems of logic.</p>
<h2 id="why-the-confusion">Why the Confusion?</h2>
<p>Although it’s a rather unsatisfying observation, the reality is that Gödel’s
proofs are onerous to all but accomplished logicians. Despite this, the
implications are far reaching. This situation creates a particularly fertile
breeding ground for misconceptions. Many venerated experts within other
disciplines attempt to apply the theorem by fallacious means.</p>
<p>A cursory Google search for “Gödel’s incompleteness theorem and God” will yield
seemingly boundless results with varied interpretations. The fact of the matter
is, the theorem strictly applies to formal axiomatic systems of logic. It does
not apply to religious texts. Likewise, it has no implications on the validity
of the afterlife or mystical intuition. (Tieszen, 2017, p. Kindle Loc. 1173)</p>
<p>As an example, Gödel’s ontological argument is often cited by theists because it
purports to formally prove the existence of God. Given the description, it is easy to see
how someone ignorant of formal logical proofs could draw fallacious conclusions.
As stated previously, Gödel’s proofs apply exclusively to formal axiomatic
systems of logic. The concept of God is far from this. Gödel himself said that
“it was undertaken as a purely logical investigation, to demonstrate that such a
proof could be carried out on the basis of accepted principles of formal logic”
(Tieszen, 2017, p. Kindle Loc. 2158). He also hesitated to publish “for fear
that a belief in God might be ascribed to him” (Tieszen, 2017, p. Kindle Loc.
2158).</p>
<p>The cogent point is that it is easy to misinterpret the significance of Gödel’s
work. It is difficult for anyone lacking a strong background in mathematical
logic to draw valid conclusions based on the incompleteness theorem. Gödel’s
work is best confined to scientific contexts.</p>
<h2 id="implications-for-artificial-intelligence">Implications for Artificial Intelligence</h2>
<p>The aim of this work is to define the implications of Gödel’s incompleteness
theorem for AI. Unfortunately, a surfeit of background concepts is requisite to
comprehension and the author humbly apologizes for the necessary discomfort.
Possibly more disappointing is that the verdict is not as definitive as one may
suppose, as this section explains.</p>
<p>One thing is definite: it is not possible to use a computer to automatically
derive every true statement of such an axiomatic system. Hilbert’s dream of
automated formalization is dead. On the bright side, if it were achievable, many
mathematicians would be out of work. Some claim, as does Roger Penrose, that this
necessarily precludes any possibility of AI within the current computational
model. The argument runs as follows: a human can comprehend some truths that a
machine cannot. The
insinuation is that humans are endowed with creativity that is not obtainable by
a machine. Mr. Penrose postulates that this is a quantum effect that is beyond
our current understanding. (Penrose, 1994)</p>
<p>Douglas Hofstadter passionately refutes Roger Penrose’s claims. He believes that
the supposed limits stem from a fundamental misunderstanding of how the brain
works and presents a compelling model of consciousness in his book, I Am a
Strange Loop (Hofstadter, 2007). Theorem proving is by no means the only way to
make a machine “think”. “The human mind is fundamentally not a logic engine but
an analogy engine, a learning engine, a guessing engine, an esthetics-driven
engine, a self-correcting engine” (Nagel & Newman, 2001, p. Kindle Loc. 146).
From this frame of reference, Gödel’s incompleteness theorem doesn’t apply to
AI.</p>
<p>Penrose and Hofstadter sit among varied experts with similar opinions. With the
considerable amount of resources funneled into AI projects, the final verdict
will be decided in due course of time. Not that this should sway the reader in
any way, but the author tends to side with Mr. Hofstadter. The reader is
encouraged to do their own research and form their own opinions.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Gödel’s incompleteness theorem is inextricably associated with philosophy,
religion, and the viability of Artificial Intelligence (AI). However, Gödel’s
work is in a recondite field, and the theorem is perplexing and often misapplied
beyond axiomatic systems of logic. In the final analysis, the theorem’s only
definitive assertion is that it is not possible for a sufficiently expressive
axiomatic system of logic to be both consistent and complete. Many experts make
conflicting ancillary claims and it’s difficult to draw any absolute
conclusions.</p>
<p>This article presents a simplistic high-level view of Gödel’s incompleteness
theorem aimed at the novice with limited exposure. It is highly recommended that
readers use this as a starting point for much deeper exploration. The books
listed in the bibliography are all excellent references for further research.</p>
<h2 id="biography">Bibliography</h2>
<p>Hofstadter, D. (2007). I Am A Strange Loop. Retrieved 8 27, 2017<br />
Nagel, E., & Newman, J. R. (2001). Gödel’s Proof: Edited and with a New Foreword
by Douglas R. Hofstadter. (D. Hofstadter, Ed.) New York University Press, NY.
Retrieved 8 27, 2017<br />
Penrose, R. (1994). Shadows of the Mind. Oxford University Press p. 413.
Retrieved 8 27, 2017<br />
Petzold, C. (2008). The Annotated Turing. Indianapolis: Wiley Publishing, Inc.<br />
Tieszen, R. (2017). Simply Gödel. New York: Simply Charly.<br />
Wolfram Research, Inc. (2017, October 30). Euclid’s Postulates. Retrieved from
Wolfram Math World: http://mathworld.wolfram.com/EuclidsPostulates.html</p>Dale Alleshousedale@alleshouse.netSentient software is the hot topic as of late. Speculative news about Artificial Intelligence (AI) systems such as Watson, Alexa, and even autonomous vehicles are dominating social media. It’s feasible that this impression is nothing more than Baader-Meinhof phenomenon (AKA frequency illusion). However, it seems that the populace has genuine interest in AI. Questions abound. Are there limits? Is it possible to create a factitious soul? Gödel’s incompleteness theorem is at the core of these questions; however, the conclusions are cryptic and often misunderstood. Gödel’s incompleteness theorem is frequently adduced as proof of antithetical concepts. For instance, Roger Penrose’s book Shadows of the Mind claims that the theorem disproves the possibility of sentient machines (Penrose, 1994, p. 65). Douglas Hofstadter asserts the opposite in his book, I Am Strange Loop (Hofstadter, 2007). This article aims to provide a cursory view of the theorem in laymen’s terms and elucidate its practical implications on AI.Diagonalization?2017-02-24T00:00:00+00:002017-02-24T00:00:00+00:00https://www.hideoushumpbackfreak.com/2017/02/24/Diagonalization<p>The goal of this article is to provide laymen with a conceptual understanding
of diagonalization. Those interested in a deep dive full of mathematical jargon
will be sorely disappointed. However, this piece is the perfect resource for a
general understanding of the topic devoid of the more arcane details. Unlike
the majority of my writing, this is not directly applicable to the daily
responsibilities of software professionals. It is purely an endeavor to satisfy
intellectual curiosity.</p>
<!--more-->
<h2 id="why">Why?</h2>
<p>The impetus for this writing comes from a colleague who contacted me after
reading my blog series on Set Theory (Set Theory Defined, Set Operations, When
Sets Collide). The posts made pithy mention of Cantor’s diagonalization proof
with implications on infinite cardinality. My friend’s search for a concise
explanation proved to be unfruitful. The conversation naturally progressed
toward Alan Turing’s seminal paper: On Computable Numbers, which also employs a
diagonalization proof. Cantor and Turing both played a major part in shaping
computer science. Therefore, although it is not likely that the majority of
software professionals will ever employ diagonalization, it’s a crucial part of
computing history.</p>
<h2 id="what-are-we-trying-to-prove">What Are We Trying to Prove?</h2>
<p>Diagonalization is a mathematical proof demonstrating that there are certain
numbers that cannot be enumerated. Stated differently, there are numbers that
cannot be listed sequentially. Consider all the numbers on the number line as
shown in Figure One – Number Line.</p>
<p><img src="/assets/images/diagonalization/figure1.png" alt="Figure 1" class="img-center" /></p>
<p>First, consider the set of non-negative whole numbers. These are known as
natural or counting numbers and are denoted as <script type="math/tex">\mathbb{N}</script>. Most
kindergarten curricula teach how to enumerate this set: starting with zero,
add one to the current number to get the next number, ad infinitum.</p>
<p>Adding negative numbers to <script type="math/tex">\mathbb{N}</script> produces the set of integers, denoted
by <script type="math/tex">\mathbb{Z}</script>. This set is also easy to enumerate by alternating between
positive and negative values: <script type="math/tex">0, 1, -1, 2, -2, 3, -3, \ldots</script></p>
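<p>The zig-zag listing of the integers can be sketched as a generator. This is
only an illustration: the names enumerateIntegers and take are hypothetical and
do not come from any library. Because the enumeration never ends, the sketch
reads only a finite prefix of it.</p>

```javascript
// Hypothetical helper: yields 0, 1, -1, 2, -2, ... without ever terminating.
function* enumerateIntegers() {
  yield 0;
  for (let n = 1; ; n++) {
    yield n;
    yield -n;
  }
}

// Hypothetical helper: collects the first `count` values of an iterator.
function take(iterator, count) {
  const result = [];
  for (const value of iterator) {
    if (result.length >= count) break;
    result.push(value);
  }
  return result;
}

const firstSeven = take(enumerateIntegers(), 7);
// firstSeven = [0, 1, -1, 2, -2, 3, -3]
```

<p>Every integer appears at some finite position in this sequence, which is
exactly what it means for a set to be enumerable.</p>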
<p>Now consider expanding on <script type="math/tex">\mathbb{Z}</script> by adding fractions to create the set
of rational numbers, denoted as <script type="math/tex">\mathbb{Q}</script>. The term rational signifies that
a number can be expressed as a ratio such as <script type="math/tex">1/2</script> or <script type="math/tex">23/345</script>. These
numbers fit between the whole numbers on the number line, and there are
infinitely many of them between any two whole numbers. In fact, regardless of
the location of two rationals on the number line, it’s always possible to find
another rational between them. With some ingenuity, these numbers can also be
enumerated in several different ways. Enumerating rational numbers, while
fascinating, is beyond the scope of this post. The reader is encouraged to
either accept my word as fact or do further research.</p>
<p>Although it seems as if we’ve run out of room on the number line, that isn’t
actually the case. There is another class of numbers that has been baffling
mathematicians throughout the ages: the irrationals. It’s a bit perplexing, but
irrationals fit between rationals on the number line (no matter how many times
I think about that, it amazes me). Grade school curriculum typically introduces
the concept with renowned numbers such as <script type="math/tex">\pi</script> or <script type="math/tex">e</script>. These are numbers
that cannot be expressed as a ratio. The decimal representation consists of an
infinite series of digits with no repeating pattern. Any decimal calculation
involving irrationals is an approximation because it’s impossible to write out
all of their digits in a finite context. Adding these to <script type="math/tex">\mathbb{Q}</script> produces the set of real numbers,
denoted as <script type="math/tex">\mathbb{R}</script>. Irrational numbers are the target of our
inquisition.</p>
<p>As a matter of note, the set of irrational numbers can be further divided into
the sets of algebraic and transcendental numbers. Algebraic numbers can in fact
be enumerated. However, this is a detail that isn’t really necessary
for understanding diagonalization. Once again, the curious reader is encouraged
to rely on Google for further inquiry.</p>
<p>The question is: how is it possible to prove that irrational numbers are not
enumerable? With an understanding of the problem, we can turn our attention to
the solution, which is diagonalization.</p>
<h2 id="reductio-ad-absurdum">Reductio Ad Absurdum</h2>
<p>Diagonalization is a type of proof known as reductio ad absurdum which is Latin
for reduction to absurdity. It is common amongst mathematicians and
philosophers alike. The premise is to first assume a proposition is true and
then disprove it via deductive reasoning thus reducing it to an absurd
conclusion.</p>
<p>One popular example of a reductio ad absurdum proof is that there is no
smallest positive fraction. Assume there is such a number: it can be divided by
two to create a smaller positive fraction. Therefore, the original assumption
is absurd. Another illustration is an alibi. First assume the suspect committed
the crime. If the accused is known to have been at a different location when
the crime took place, it’s absurd to assume that they were also at the scene of
the crime.</p>
<h2 id="diagonalization">Diagonalization</h2>
<p>Having addressed all the introductory trivialities, it’s time to get to the
point. The diagonalization proof is as follows. First assume that it is
possible to enumerate all irrational numbers. If this is true, it should be
impossible to devise a number that is not included in this list. Examine Figure
Two – Diagonalization and stretch the mind to imagine that this is in fact the
list of all irrational numbers: the list is infinitely long and each number
expands endlessly. Next, draw a diagonal line through the list so that it
crosses the first digit of the first number, the second digit of the second
number, and so on, and write down the resulting infinite number. In this case,
the number is <script type="math/tex">0.13579135…</script>. Next, add 1 to each digit, except in the case of
nine, which becomes a zero. This results in the number <script type="math/tex">0.24680246…</script>. Is this
number contained in the list? It’s not the first number because the first
digit does not match. The same holds true for the second number because the
second digit has to be different. Continue this line of logic for every number
and it becomes clear that the devised number is not in the list. The reader
should take a few minutes to let that sink in.</p>
<p><img src="/assets/images/diagonalization/figure2.png" alt="Figure 2" class="img-center" /></p>
<p>Keep in mind, this is purely a thought experiment. Obviously, Figure Two –
Diagonalization is not an infinite list and each number is not truly
irrational. It’s impossible to construct such a list in a finite context.
However, the line of logic holds true.</p>
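<p>The digit manipulation itself can be sketched on a finite stand-in for the
list. Everything here is an assumption made for illustration: the sample digit
strings are made up (chosen so the diagonal reads 1, 3, 5, 7, 9), and the
diagonalize function is a hypothetical name. A five-row list proves nothing
about infinite ones; the sketch only shows why the devised number must differ
from every row.</p>

```javascript
// Made-up sample rows. Each string holds the digits after the decimal point.
const list = [
  "1415926",
  "7320508",
  "4152135",
  "6187339",
  "7182982",
];

// Hypothetical helper: take the n-th digit of the n-th row and add one,
// wrapping nine around to zero.
function diagonalize(rows) {
  return rows
    .map((digits, index) => (Number(digits[index]) + 1) % 10)
    .join("");
}

const devised = "0." + diagonalize(list);
// devised = "0.24680"

// The n-th digit of the devised number never matches the n-th digit of row n.
const differsFromEveryRow = list.every(
  (digits, index) => digits[index] !== devised[index + 2]);
// differsFromEveryRow = true
```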
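<p>It is common to wonder why diagonalization does not apply to <script type="math/tex">\mathbb{Q}</script>.
After all, many rationals, such as <script type="math/tex">1/3</script>, also have infinitely many digits.
The concise answer is that the digits of a rational number always settle into a
repeating pattern, and nothing guarantees that the devised number does the
same. The devised number may well be irrational, in which case its absence from
a list of rationals contradicts nothing.</p>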
<h2 id="implications">Implications</h2>
<p>If the diagonalization proof is accepted as valid, it has some profound
implications. At first glance, it’s difficult to understand how the fact that
it’s impossible to enumerate irrational numbers has bearing on the world in any
way. However, many people have derived some amazing conclusions. Cantor showed
that there are in fact multiple infinities. Turing used diagonalization to
prove the limits of computability. It’s even been employed by philosophers to
prove that there are an insufficient number of proofs to prove all the truths
in the universe. More concisely, some truths are unprovable. The implications
lead down an exceedingly dark and deep rabbit hole.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Diagonalization is a reductio ad absurdum proof that demonstrates the
impossibility of enumerating irrational numbers. It is relatively easy for
non-mathematicians to understand. While only tangentially related to software
engineering, it’s a fascinating concept that sheds light on the foundations of
computing and indeed the world.</p>
<p>As always, thank you for taking the time to read this article. Please feel free
to contact me with any questions or concerns.</p>Dale Alleshousedale@alleshouse.netThe goal of this article is to provide laymen with a conceptual understanding of diagonalization. Those interested in a deep dive full of mathematical jargon will be sorely disappointed. However, this piece is the perfect resource for a general understanding of the topic devoid of the more arcane details. Unlike the majority of my writing, this is not directly applicable to the daily responsibilities of software professionals. It is purely an endeavor to satisfy intellectual curiosity.Just Enough Set Theory - When Sets Collide (Part 3 of 3)2017-02-22T00:00:00+00:002017-02-22T00:00:00+00:00https://www.hideoushumpbackfreak.com/2017/02/22/When-Sets-Collide<p>Welcome to the final installment of this three-part series on set theory. The
first piece, <a href="/2017/02/05/Set-Theory-Defined.html">Set Theory Defined</a>, detailed requisite foundational knowledge. The second article, <a href="/2017/02/19/Set-Operations.html">Set
Operations</a>, outlined some beneficial
set algorithms. This post develops the concepts laid out in the first two;
therefore, it is highly recommended that readers begin there.</p>
<p>Individual sets have many useful properties; however, performing operations on
multiple sets provides even greater utility. This piece outlines four such
operations. Each operation provides a concise means for addressing common
programming problems that virtually all software professionals encounter. There
is a brief description of each from a mathematical perspective followed by
JavaScript (ES6) code excerpts demonstrating how to apply theory to real world
scenarios.</p>
<!--more-->
<p><strong>NOTE</strong>: All code samples are written in ES6 and are therefore not likely to
execute directly in a browser. The best option is to use Node or transpile the
excerpts using either <a href="https://babeljs.io/">Babel</a> or
<a href="https://www.typescriptlang.org/">TypeScript</a>. The working code is available on
<a href="https://github.com/dalealleshouse/settheory">GitHub</a> along with execution
instructions.</p>
<h2 id="union">Union</h2>
<p>The union of two sets is a set containing the distinct elements from both sets.
<script type="math/tex">\cup</script> is the mathematical symbol for a union and the union of sets <script type="math/tex">A</script> and
<script type="math/tex">B</script> is denoted as <script type="math/tex">A \cup B</script> . An expanded way of representing the union
relationship is <script type="math/tex">\{x| x \in A \vee x \in B\}</script> , which means every element
contained in <script type="math/tex">A</script> OR (<script type="math/tex">∨</script>) <script type="math/tex">B</script>. Figure One – Union depicts two sets with
three elements each. The union is a set with five elements because one item,
three, is shared and union returns distinct values. The Venn diagram shows the
relationship graphically.</p>
<p><img src="/assets/images/set-theory-3/figure1.png" alt="Figure 1" class="img-center" /></p>
<p>Generating the union of two sets is quite easy in ES6 as the code below
illustrates.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">A</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Set</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]);</span>
<span class="kd">const</span> <span class="nx">B</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Set</span><span class="p">([</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">]);</span>
<span class="kd">const</span> <span class="nx">union</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Set</span><span class="p">([...</span><span class="nx">A</span><span class="p">,</span> <span class="p">...</span><span class="nx">B</span><span class="p">]);</span>
<span class="c1">// union = Set { 1, 2, 3, 4, 5 }</span>
</code></pre></div></div>
<p>The astute reader will notice that there’s some legerdemain afoot. The code
above uses the ES6 Set data structure instead of standard JavaScript arrays.
Set holds only unique elements by ignoring add operations for new values that
match existing ones. The algorithm is as easy as concatenating the two sets
without worrying about duplicates. If the code were using standard
arrays, there would have to be logic to remove duplicated items. Luckily,
converting between sets and arrays is virtually effortless.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">setDataStructure</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Set</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]);</span>
<span class="kd">const</span> <span class="nx">arrayDataStructure</span> <span class="o">=</span> <span class="nb">Array</span><span class="p">.</span><span class="k">from</span><span class="p">(</span><span class="nx">setDataStructure</span><span class="p">);</span>
</code></pre></div></div>
<p>The problem with the code above is that it’s a rare requirement to union sets
containing primitive values. Software engineering is seldom that
straightforward. A more realistic scenario is calculating the union between two
sets of complex objects where equality becomes problematic. Unlike primitive
variables, objects with identical values are not equal because they compare by
reference. This abrogates the Set trick from earlier. Suppose the requirement
is to compute all bug reports currently in process across two teams and it’s
possible that both teams are working on the same bugs simultaneously. The code
below demonstrates a solution by first concatenating the two sets and then
removing duplicates using the filter method introduced in the last article.
Notice that the only equality check is via the id property. Obviously, this
won’t work for every scenario. Depending on the size of the sets and the
performance requirements, it may be preferable to write generic deep equality
methods (or use a library like Underscore).</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">teamABugs</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Screen Explodes</span><span class="dl">"</span> <span class="p">},</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Keyboard Bursts into Flames</span><span class="dl">"</span> <span class="p">},</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Submit button off by 1 pixel</span><span class="dl">"</span> <span class="p">}];</span>
<span class="kd">const</span> <span class="nx">teamBBugs</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">5</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Randomly Dials Russian Hackers</span><span class="dl">"</span> <span class="p">},</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">6</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Publishes CC info to the www</span><span class="dl">"</span> <span class="p">},</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Submit button off by 1 pixel</span><span class="dl">"</span> <span class="p">}];</span>
<span class="kd">const</span> <span class="nx">union</span> <span class="o">=</span> <span class="p">[...</span><span class="nx">teamABugs</span><span class="p">,</span> <span class="p">...</span><span class="nx">teamBBugs</span><span class="p">]</span>
<span class="p">.</span><span class="nx">filter</span><span class="p">((</span><span class="nx">x</span><span class="p">,</span> <span class="nx">index</span><span class="p">,</span> <span class="nx">array</span><span class="p">)</span> <span class="o">=></span> <span class="nx">array</span><span class="p">.</span><span class="nx">findIndex</span><span class="p">(</span><span class="nx">y</span> <span class="o">=></span> <span class="nx">y</span><span class="p">.</span><span class="nx">id</span> <span class="o">==</span> <span class="nx">x</span><span class="p">.</span><span class="nx">id</span><span class="p">)</span> <span class="o">==</span> <span class="nx">index</span><span class="p">);</span>
</code></pre></div></div>
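<p>The filter and findIndex combination rescans the array for every element,
which may matter for large sets. One possible alternative, sketched here under
the assumption that each object carries a key that identifies it, tracks the
keys already seen in a Map. The unionBy helper and the shortened sample data
are hypothetical and are not part of the repository linked above.</p>

```javascript
// Hypothetical helper: union of two arrays of objects. `keyOf` decides what
// makes two objects "the same"; the first occurrence of each key wins.
function unionBy(a, b, keyOf) {
  const seen = new Map();
  for (const item of [...a, ...b]) {
    const key = keyOf(item);
    if (!seen.has(key)) seen.set(key, item);
  }
  return [...seen.values()];
}

const teamA = [
  { id: 1, name: "Screen Explodes" },
  { id: 3, name: "Submit button off by 1 pixel" }];
const teamB = [
  { id: 5, name: "Randomly Dials Russian Hackers" },
  { id: 3, name: "Submit button off by 1 pixel" }];

const allBugs = unionBy(teamA, teamB, bug => bug.id);
// allBugs holds the bugs with ids 1, 3, and 5
```

<p>Passing the key function as a parameter keeps the equality rule in one
place, so switching from id to some other property requires no change to the
helper.</p>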
<h2 id="intersection">Intersection</h2>
<p>The intersection of two sets is a set containing distinct shared elements. <script type="math/tex">A
\cap B</script> is the mathematical representation of an intersection and the expanded
notation is <script type="math/tex">\{x|x \in A \wedge x \in B \}</script>. Stated differently, the
intersection of sets <script type="math/tex">A</script> and <script type="math/tex">B</script> is every element contained in
<script type="math/tex">A</script> AND (<script type="math/tex">\wedge</script>) <script type="math/tex">B</script>. Figure Two – Intersection depicts the relationship showing the
intersection of <script type="math/tex">A</script> and <script type="math/tex">B</script> to be a singleton set containing only the number
three. Once again, the Venn diagram portrays the relationship.</p>
<p><img src="/assets/images/set-theory-3/figure2.png" alt="Figure 2" class="img-center" /></p>
<p>Much like union, finding the intersection of two sets using the Set data
structure and primitive types is easy. The code below shows how it’s a matter
of using the filter method to check to see if an item is also stored in the
other set.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">A</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Set</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]);</span>
<span class="kd">const</span> <span class="nx">B</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Set</span><span class="p">([</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">]);</span>
<span class="kd">const</span> <span class="nx">intersect</span> <span class="o">=</span> <span class="p">[...</span><span class="nx">A</span><span class="p">].</span><span class="nx">filter</span><span class="p">(</span><span class="nx">x</span> <span class="o">=></span> <span class="nx">B</span><span class="p">.</span><span class="nx">has</span><span class="p">(</span><span class="nx">x</span><span class="p">));</span>
<span class="c1">// intersect = [3];</span>
</code></pre></div></div>
<p>The code above is a bit fanciful. Consider instead a role-protected resource.
Possessing any one of many roles allows users to access said resource. Users
each have a set of associated roles. There are a few different ways to check
access, but finding the intersection between the user’s roles and the
resource’s required roles is the most manageable. See the code below.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">resourceRoles</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Administrator</span><span class="dl">"</span> <span class="p">},</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Super User</span><span class="dl">"</span> <span class="p">}];</span>
<span class="kd">const</span> <span class="nx">user</span> <span class="o">=</span> <span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">314</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Edsger Dijkstra</span><span class="dl">"</span><span class="p">,</span> <span class="na">roles</span><span class="p">:</span> <span class="p">[</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Administrator</span><span class="dl">"</span> <span class="p">},</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">User</span><span class="dl">"</span> <span class="p">}]</span> <span class="p">};</span>
<span class="kd">const</span> <span class="nx">hasAccess</span> <span class="o">=</span> <span class="nx">resourceRoles</span>
<span class="p">.</span><span class="nx">filter</span><span class="p">(</span><span class="nx">x</span> <span class="o">=></span> <span class="nx">user</span><span class="p">.</span><span class="nx">roles</span><span class="p">.</span><span class="nx">find</span><span class="p">(</span><span class="nx">y</span> <span class="o">=></span> <span class="nx">y</span><span class="p">.</span><span class="nx">name</span> <span class="o">==</span> <span class="nx">x</span><span class="p">.</span><span class="nx">name</span><span class="p">)).</span><span class="nx">length</span> <span class="o">></span> <span class="mi">0</span><span class="p">;</span>
</code></pre></div></div>
<p>All of the caveats about equality described in the Union section also apply
here. It’s something programmers need to be cognizant of.</p>
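<p>When the only question is whether the intersection is empty, there is no
need to build it. The sketch below stops at the first shared role. The
sharesRole helper and the sample role lists are hypothetical, and the sketch
assumes that role names are unique.</p>

```javascript
// Hypothetical helper: true when the two role lists share at least one name.
function sharesRole(requiredRoles, userRoles) {
  const requiredNames = new Set(requiredRoles.map(role => role.name));
  return userRoles.some(role => requiredNames.has(role.name));
}

const protectedRoles = [
  { id: 1, name: "Administrator" },
  { id: 2, name: "Super User" }];
const adminRoles = [
  { id: 1, name: "Administrator" },
  { id: 3, name: "User" }];
const guestRoles = [{ id: 3, name: "User" }];

const adminHasAccess = sharesRole(protectedRoles, adminRoles);
const guestHasAccess = sharesRole(protectedRoles, guestRoles);
// adminHasAccess = true, guestHasAccess = false
```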
<h2 id="difference">Difference</h2>
<p>The difference of two sets is sometimes known as the relative complement; the
two names are interchangeable. The concept is simple: the difference is a
set made up of the items left over in one set after removing the elements it
shares with another set. Otherwise stated, all of the items in set <script type="math/tex">B</script> that do not exist
in set <script type="math/tex">A</script>. Mathematically, this is represented as <script type="math/tex">\{x|x \in B \wedge x
\notin A\}</script> or the shortened version which is <script type="math/tex">B \setminus A</script>. Figure Three –
Difference shows the difference between <script type="math/tex">B</script> and <script type="math/tex">A</script> to be a set containing
four and five. Just as above, there is a representative Venn diagram.</p>
<p><img src="/assets/images/set-theory-3/figure3.png" alt="Figure 3" class="img-center" /></p>
<p>As an aside, there is also an absolute complement, which is somewhat similar;
however, it is outside the scope of this article.</p>
<p>Finding the difference of sets is almost identical to finding the intersection
as the code below demonstrates. The only variation is that the predicate passed
to the filter method is negated.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">A</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Set</span><span class="p">([</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">]);</span>
<span class="kd">const</span> <span class="nx">B</span> <span class="o">=</span> <span class="k">new</span> <span class="nb">Set</span><span class="p">([</span><span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">]);</span>
<span class="kd">const</span> <span class="nx">difference</span> <span class="o">=</span> <span class="p">[...</span><span class="nx">B</span><span class="p">].</span><span class="nx">filter</span><span class="p">(</span><span class="nx">x</span> <span class="o">=></span> <span class="o">!</span><span class="nx">A</span><span class="p">.</span><span class="nx">has</span><span class="p">(</span><span class="nx">x</span><span class="p">));</span>
<span class="c1">// difference = [4,5];</span>
</code></pre></div></div>
<p>Again, a more realistic example is in order. Imagine that there is a set of
actions that must be completed and a set of actions a user has completed.
Finding the difference is an easy way to determine if all required actions are
complete.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">requiredActions</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Electronic Signing</span><span class="dl">"</span> <span class="p">},</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Submission Form</span><span class="dl">"</span> <span class="p">},</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Payment</span><span class="dl">"</span> <span class="p">}];</span>
<span class="kd">const</span> <span class="nx">userActions</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Electronic Signing</span><span class="dl">"</span> <span class="p">},</span>
<span class="p">{</span> <span class="na">id</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Submission Form</span><span class="dl">"</span> <span class="p">}];</span>
<span class="kd">const</span> <span class="nx">complete</span> <span class="o">=</span> <span class="nx">requiredActions</span>
<span class="p">.</span><span class="nx">filter</span><span class="p">(</span><span class="nx">x</span> <span class="o">=></span> <span class="o">!</span><span class="nx">userActions</span><span class="p">.</span><span class="nx">find</span><span class="p">(</span><span class="nx">y</span> <span class="o">=></span> <span class="nx">y</span><span class="p">.</span><span class="nx">name</span> <span class="o">==</span> <span class="nx">x</span><span class="p">.</span><span class="nx">name</span><span class="p">)).</span><span class="nx">length</span> <span class="o">===</span> <span class="mi">0</span><span class="p">;</span>
<span class="c1">// complete = false</span>
</code></pre></div></div>
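<p>The same check can be sketched with a lookup keyed on <code class="language-plaintext highlighter-rouge">id</code> rather than <code class="language-plaintext highlighter-rouge">name</code>. The helper name <code class="language-plaintext highlighter-rouge">missingById</code> is hypothetical, and the sketch assumes that ids uniquely identify actions. It also returns the outstanding actions themselves, which is often more useful than a bare Boolean.</p>

```javascript
// Hypothetical helper: the set difference required \ completed, compared by id.
const missingById = (required, completed) => {
  // A Set of ids gives a constant-time membership test per item.
  const completedIds = new Set(completed.map(a => a.id));
  return required.filter(a => !completedIds.has(a.id));
};

const required = [
  { id: 1, name: "Electronic Signing" },
  { id: 2, name: "Submission Form" },
  { id: 3, name: "Payment" }];
const done = [
  { id: 1, name: "Electronic Signing" },
  { id: 2, name: "Submission Form" }];

const missing = missingById(required, done);
// missing = [{ id: 3, name: "Payment" }]
const allComplete = missing.length === 0;
// allComplete = false
```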
<h2 id="cartesian-product">Cartesian Product</h2>
<p>The Cartesian product of two sets is the set of ordered pairs containing every
possible combination of an element from the first set with an element from the
second. The mathematical
representation is <script type="math/tex">A \times B</script>. The expanded notation is <script type="math/tex">\{(a,b)|a \in A
\wedge b \in B\}</script>, which denotes the set of all ordered pairs <script type="math/tex">(a,b)</script> in which <script type="math/tex">a</script> is an element of
<script type="math/tex">A</script> AND (<script type="math/tex">\wedge</script>) <script type="math/tex">b</script> is an element of <script type="math/tex">B</script>. Figure Four – Cartesian Product
demonstrates the concept. Importantly, unlike ordinary multiplication,
the Cartesian product is not commutative. Stated mathematically, <script type="math/tex">A \times B
\ne B \times A</script> whenever the two sets differ and neither is empty. Swapping the
operands reverses the order of the elements within each pair.</p>
<p><img src="/assets/images/set-theory-3/figure4.png" alt="Figure 4" class="img-center" /></p>
<p>The Cartesian product is useful for combinatorics problems. A common example is
simulating a deck of cards. Instead of specifying all the cards explicitly in
code, it’s easier to define the suits and values as two separate sets and then
take the Cartesian product to get the entire deck. See the code below.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">suits</span> <span class="o">=</span> <span class="p">[</span><span class="dl">'</span><span class="s1">Diamond</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">Spade</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">Heart</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">Club</span><span class="dl">'</span><span class="p">];</span>
<span class="kd">const</span> <span class="nx">values</span> <span class="o">=</span> <span class="p">[</span><span class="dl">'</span><span class="s1">Ace</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">2</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">3</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">4</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">5</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">6</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">7</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">8</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">9</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">10</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">Jack</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">Queen</span><span class="dl">'</span><span class="p">,</span> <span class="dl">'</span><span class="s1">King</span><span class="dl">'</span><span class="p">];</span>
<span class="kd">const</span> <span class="nx">cards</span> <span class="o">=</span> <span class="nx">suits</span><span class="p">.</span><span class="nx">reduce</span><span class="p">((</span><span class="nx">acc</span><span class="p">,</span> <span class="nx">x</span><span class="p">)</span> <span class="o">=></span> <span class="p">[...</span><span class="nx">acc</span><span class="p">,</span> <span class="p">...</span><span class="nx">values</span><span class="p">.</span><span class="nx">map</span><span class="p">(</span><span class="nx">y</span> <span class="o">=></span> <span class="p">[</span><span class="nx">x</span><span class="p">,</span> <span class="nx">y</span><span class="p">])],</span> <span class="p">[]);</span>
<span class="c1">// Alternatively, it’s possible to return the ordered pair as an object instead of an array</span>
<span class="c1">// const cards = suits.reduce((acc, x) => [...acc, ...values.map(y => { return { suit: x, value: y } })], []);</span>
</code></pre></div></div>
<p>This code should be starting to look familiar because all the samples make
heavy use of the map, reduce, and filter methods. Using ES6, these methods have
great utility for mimicking mathematical set operations. Because the code above
is similar to previous examples, it doesn’t require further explanation.</p>
<h2 id="why-stop-at-two">Why Stop at Two?</h2>
<p>Up to this point, all the exhibited set operations employ two sets. However,
this is for the sake of brevity. Each operation can act on as many sets as
required. For instance, <script type="math/tex">A \cup B \cup C</script> is perfectly valid as is <script type="math/tex">A \times
B \times C</script>. The enthused reader should solidify his/her learning by expanding
each code sample to use additional sets.</p>
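<p>As a starting point for that exercise, here is a minimal sketch of a union and a Cartesian product that accept any number of sets. The helper names <code class="language-plaintext highlighter-rouge">union</code> and <code class="language-plaintext highlighter-rouge">cartesian</code> are hypothetical, and the sketch assumes the elements are primitives so that <code class="language-plaintext highlighter-rouge">Set</code> can detect duplicates.</p>

```javascript
// Hypothetical helper: union of any number of arrays, duplicates removed.
const union = (...sets) =>
  [...new Set(sets.reduce((acc, set) => [...acc, ...set], []))];

// Hypothetical helper: Cartesian product of any number of arrays.
// Start with one empty tuple, then extend every tuple with every
// element of the next set.
const cartesian = (...sets) =>
  sets.reduce(
    (acc, set) => acc.reduce(
      (out, tuple) => [...out, ...set.map(x => [...tuple, x])], []),
    [[]]);

const all = union([1, 2], [2, 3], [3, 4]);
// all = [1, 2, 3, 4]

const triples = cartesian([1, 2], ['a', 'b'], [true, false]);
// triples.length = 8 (2 * 2 * 2)
// triples[0] = [1, 'a', true]
```

<p>Both helpers lean on reduce to fold one set at a time into an accumulated result, which is what allows them to scale beyond two sets.</p>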
<h2 id="real-world-applications">Real World Applications</h2>
<p>This series demonstrated how set theory is applied to data structures and
demonstrated some novel uses for set operations in order to create efficient
algorithms. However, this is only a meager representation of all the many and
varied applications for software engineering. Relational databases make heavy
use of set theory for defining data structure and constructing data queries. In
fact, SQL is essentially a set notation. There are several instances in
language theory and design where strings are realized as sets and set
operations are performed on them. Another prolific use is in computer graphics
where points on a plane are treated as sets. The list of applications is
considerable. It’s a body of knowledge that no software professional should
forsake.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Thus concludes this three-part series on set theory. Hopefully, the reader has
gained a high-level understanding as well as enough practical knowledge to
apply the learning forthwith. The first article outlined the basics and
introduced the concept of set mapping. Empty sets, cardinality, subsets,
summation, and power sets were introduced in the second piece. Finally, this
post presented operations involving more than one set including unions,
intersections, differences, and Cartesian products. The method was to first
introduce the ideas mathematically and then demonstrate how to apply them using
ES6. These concepts should not be considered optional for software
professionals because set theory is ubiquitous in computer science.</p>
<p>As always, thank you for reading and please feel free to contact me with
questions. I’m also happy to create more in depth posts upon request.</p>Dale Alleshousedale@alleshouse.netWelcome to the final installment of this three-part series on set theory. The first piece, Set Theory Defined, detailed requisite foundational knowledge. The second article, Set Operations, outlined some beneficial set algorithms. This post develops the concepts laid out in the first two; therefore, it is highly recommended that readers begin there. Individual sets have many useful properties; however, preforming operations on multiple sets provides even greater utility. This piece outlines four such operations. Each operation provides a concise means for addressing common programming problems that virtually all software professionals encounter. There is a brief description of each from a mathematical perspective followed by JavaScript (ES6) code excerpts demonstrating how to apply theory to real world scenarios.Just Enough Set Theory - Set Operations (Part 2 of 3)2017-02-19T00:00:00+00:002017-02-19T00:00:00+00:00https://www.hideoushumpbackfreak.com/2017/02/19/Set-Operations<p>Welcome to the second installment of this three-part series on set theory. The
first piece, <a href="/2017/02/05/Set-Theory-Defined.html">Set Theory Defined</a>
(recently updated with code samples), detailed requisite foundational knowledge.
It is highly recommended that readers begin there if they haven’t already.</p>
<p>The first piece in this series introduced sets and exhibited how ES6 arrays are
analogous to them. It also depicted how to transform, or map, a set into a
related set. This post expands on set theory by probing into set operations.</p>
<!--more-->
<p><strong>NOTE</strong>: All code samples are written in ES6 and are therefore not likely to
execute directly in a browser. The best option is to use Node or transpile the
excerpts using either <a href="https://babeljs.io/">Babel</a> or
<a href="https://www.typescriptlang.org/">TypeScript</a>. The working code is available on
<a href="https://github.com/dalealleshouse/settheory">GitHub</a> along with execution
instructions.</p>
<h2 id="empty-sets">Empty Sets</h2>
<p>Empty sets are a rather mundane topic, but nonetheless worth mentioning. As the
name implies, they are simply sets that have no elements. They are also commonly
referred to as null sets. Mathematically, empty sets are represented as either
<script type="math/tex">\emptyset</script> or <script type="math/tex">\{\}</script>. The concept relates to empty arrays in software.</p>
<h2 id="cardinality">Cardinality</h2>
<p>The term cardinality sounds impressive; however, it’s simply the number of
elements in a set. The mathematical representation of a set with three elements
is as depicted in Figure One – Cardinality.</p>
<p><img src="/assets/images/set-theory-2/figure1.png" alt="Figure 1" class="img-center" /></p>
<p>In JavaScript, the cardinality of an array is its length. See the code below.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">someSet</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">];</span>
<span class="kd">const</span> <span class="nx">cardinality</span> <span class="o">=</span> <span class="nx">someSet</span><span class="p">.</span><span class="nx">length</span><span class="p">;</span>
<span class="c1">// cardinality = 5</span>
</code></pre></div></div>
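<p>One caveat: unlike a mathematical set, an array may hold the same value more than once, in which case its length overstates the cardinality. A minimal sketch, assuming the elements are primitives, is to count distinct values with the built-in <code class="language-plaintext highlighter-rouge">Set</code> object.</p>

```javascript
const withDuplicates = [1, 2, 2, 3, 3, 3];

// Set discards repeated values; size reports how many distinct ones remain.
const distinctCount = new Set(withDuplicates).size;
// distinctCount = 3, whereas withDuplicates.length = 6
```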
<h2 id="subsets">Subsets</h2>
<p>Subsets are relatively easy to explain, yet have far reaching implications. A
subset is a portion of a larger set. For instance, consider the set of all
animals (<script type="math/tex">A</script>). The set of all dogs (<script type="math/tex">D</script>) is a subset of the animal set
because although not every animal is a dog, every dog is an animal. The
mathematical notation for subsets is as follows: <script type="math/tex">D \subseteq A</script>. Another way
of mathematically expressing the subset relationship is <script type="math/tex">\forall x(x\in D
\implies x\in A)</script>. That looks absurd, but the premise is that for any
(<script type="math/tex">\forall</script>) element (<script type="math/tex">x</script>) in <script type="math/tex">D</script>, it is implied (<script type="math/tex">\implies</script>) that the
element (<script type="math/tex">x</script>) also exists in <script type="math/tex">A</script>.</p>
<p><img src="/assets/images/set-theory-2/figure2.png" alt="Figure 2" class="img-center" /></p>
<p>Subsets are often taught with Venn Diagrams. See Figure Three – Venn Diagrams
for an example. Admittedly, this account of subsets is a bit prosaic. However,
the final post in this series relies heavily on the concept so it bears
belaboring the point.</p>
<p><img src="/assets/images/set-theory-2/figure3.png" alt="Figure 3" class="img-center" /></p>
<p>ES6 has a built-in filter method on the array object that enables easy access to
subsets. Filter takes a predicate as an argument. Recall from the first article
that a predicate is a function that takes a single argument and returns a
Boolean response. The filter method applies the predicate to each item in a set
and creates a new set that includes the items where the predicate returned true.
See the code below.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">animals</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span><span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Tom</span><span class="dl">"</span><span class="p">,</span> <span class="na">type</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Cat</span><span class="dl">"</span><span class="p">},</span>
<span class="p">{</span><span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Jerry</span><span class="dl">"</span><span class="p">,</span> <span class="na">type</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Mouse</span><span class="dl">"</span><span class="p">},</span>
<span class="p">{</span><span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Pluto</span><span class="dl">"</span><span class="p">,</span> <span class="na">type</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Dog</span><span class="dl">"</span><span class="p">},</span>
<span class="p">{</span><span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Scooby Doo</span><span class="dl">"</span><span class="p">,</span> <span class="na">type</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Dog</span><span class="dl">"</span><span class="p">}];</span>
<span class="kd">const</span> <span class="nx">dogs</span> <span class="o">=</span> <span class="nx">animals</span><span class="p">.</span><span class="nx">filter</span><span class="p">(</span><span class="nx">a</span> <span class="o">=></span> <span class="nx">a</span><span class="p">.</span><span class="nx">type</span> <span class="o">==</span> <span class="dl">"</span><span class="s2">Dog</span><span class="dl">"</span><span class="p">);</span>
<span class="c1">// dogs = [{name: "Pluto", type: "Dog"}, {name: "Scooby Doo", type: "Dog"}]</span>
</code></pre></div></div>
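<p>Filter produces a subset; testing whether one set is a subset of another is a separate question. The sketch below mirrors the formal definition given earlier (every element of the first set must also exist in the second). The helper name <code class="language-plaintext highlighter-rouge">isSubset</code> is hypothetical, and the comparison assumes primitive elements.</p>

```javascript
// Hypothetical helper: true when every element of `subset` is in `superset`.
const isSubset = (subset, superset) => {
  const members = new Set(superset);
  return subset.every(x => members.has(x));
};

const a = isSubset([1, 2], [1, 2, 3]);
// a = true
const b = isSubset([1, 4], [1, 2, 3]);
// b = false
const c = isSubset([], [1, 2, 3]);
// c = true, because the empty set is a subset of every set
```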
<h2 id="summation">Summation</h2>
<p>The term summation is a bit misleading because it implies simply adding elements
together; however, it’s a more powerful concept. Summation applies a function to
each element of a set reducing it to a single value. <script type="math/tex">\sum_{x \in S}f(x)</script> is
the mathematical notation representing the algorithm where <script type="math/tex">S</script> can be any set
and <script type="math/tex">f(x)</script> can be any function. Consider Figure Four – Summation. Given the
set <script type="math/tex">A</script>, each element in the set is multiplied by two and the results are added together.</p>
<p><img src="/assets/images/set-theory-2/figure4.png" alt="Figure 4" class="img-center" /></p>
<p>ES6’s reduce method of the array object is comparable to summation. Aptly named,
reduce applies a function to each member of a set reducing it to a single value.
It accepts two arguments: a function and an optional starting value. The
function accepts an accumulated value and the current item. The state of the
accumulated value after all items are processed is the final return value. The
code below is the same process detailed in Figure Four – Summation.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">someSet</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">];</span>
<span class="kd">const</span> <span class="nx">sum</span> <span class="o">=</span> <span class="nx">someSet</span><span class="p">.</span><span class="nx">reduce</span><span class="p">((</span><span class="nx">acc</span><span class="p">,</span> <span class="nx">x</span><span class="p">)</span> <span class="o">=></span> <span class="nx">acc</span> <span class="o">+</span> <span class="nx">x</span> <span class="o">*</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
<span class="c1">// sum = 12</span>
</code></pre></div></div>
<p>Reduce is useful for many operations beyond mathematical functions. The code
below utilizes it to extract email addresses from a set of users.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">users</span> <span class="o">=</span> <span class="p">[</span>
<span class="p">{</span><span class="na">id</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="na">email</span><span class="p">:</span> <span class="dl">"</span><span class="s2">email@email.com</span><span class="dl">"</span><span class="p">},</span>
<span class="p">{</span><span class="na">id</span><span class="p">:</span> <span class="mi">2</span><span class="p">,</span> <span class="na">email</span><span class="p">:</span> <span class="dl">"</span><span class="s2">email2@email2.com</span><span class="dl">"</span><span class="p">},</span>
<span class="p">{</span><span class="na">id</span><span class="p">:</span> <span class="mi">3</span><span class="p">,</span> <span class="na">email</span><span class="p">:</span> <span class="dl">"</span><span class="s2">email3@email.com</span><span class="dl">"</span><span class="p">}];</span>
<span class="kd">const</span> <span class="nx">emails</span> <span class="o">=</span> <span class="nx">users</span><span class="p">.</span><span class="nx">map</span><span class="p">(</span><span class="nx">u</span> <span class="o">=></span> <span class="nx">u</span><span class="p">.</span><span class="nx">email</span><span class="p">).</span><span class="nx">reduce</span><span class="p">((</span><span class="nx">acc</span><span class="p">,</span> <span class="nx">x</span><span class="p">)</span> <span class="o">=></span> <span class="s2">`</span><span class="p">${</span><span class="nx">acc</span><span class="p">}</span><span class="s2">;</span><span class="p">${</span><span class="nx">x</span><span class="p">}</span><span class="s2">`</span><span class="p">);</span>
<span class="c1">// emails = "email@email.com;email2@email2.com;email3@email.com"</span>
</code></pre></div></div>
<p>This above doesn’t do the reduce method proper justice because its efficacy is
virtually endless. There are many more options that are outside the scope of
this feature. The reader is highly encouraged to find more information on
Mozilla’s excellent JavaScript reference.</p>
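<p>As one further illustration, reduce can fold a set into an object instead of a number or a string. The sketch below tallies animals by type; the variable names are illustrative only.</p>

```javascript
const animals = [
  {name: "Tom", type: "Cat"},
  {name: "Jerry", type: "Mouse"},
  {name: "Pluto", type: "Dog"},
  {name: "Scooby Doo", type: "Dog"}];

// The accumulator is a plain object; each step returns a new copy
// with the count for the current type incremented.
const countByType = animals.reduce(
  (acc, a) => Object.assign({}, acc, { [a.type]: (acc[a.type] || 0) + 1 }),
  {});
// countByType = { Cat: 1, Mouse: 1, Dog: 2 }

// With a starting value, reduce is also safe on an empty set.
const emptyCount = [].reduce(
  (acc, a) => Object.assign({}, acc, { [a.type]: (acc[a.type] || 0) + 1 }),
  {});
// emptyCount = {}
```

<p>Supplying the starting value matters: calling reduce on an empty array without one throws a TypeError.</p>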
<h2 id="power-set">Power Set</h2>
<p>Power sets are something every programmer has to deal with at some point in
his/her career, even if they can’t formally identify them by name. In
mathematical parlance, power sets are denoted as <script type="math/tex">P(A)</script>. A power set is the
set of all subsets of a given set, including the empty set and the set itself: more succinctly, all
possible set combinations. A power set always contains <script type="math/tex">2^n</script> elements where
<script type="math/tex">n</script> is the cardinality of the original set (<script type="math/tex">|P(A)|=2^{|A|}</script>).</p>
<p>Power sets are difficult to conceptualize without an example. Figure Five –
Power Set depicts a set with three elements. The power set is all possible
combinations of the three elements. The result is a set with a cardinality of
eight (<script type="math/tex">2^3</script>).</p>
<p><img src="/assets/images/set-theory-2/figure5.png" alt="Figure 5" class="img-center" /></p>
<p>Unfortunately, there isn’t an innate JavaScript method for creating power sets.
However, that’s an easy problem to overcome given some ingenuity. See the code
below.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">someSet</span> <span class="o">=</span> <span class="p">[</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">];</span>
<span class="kd">const</span> <span class="nx">powerSet</span> <span class="o">=</span> <span class="nx">someSet</span><span class="p">.</span><span class="nx">reduce</span><span class="p">((</span><span class="nx">acc</span><span class="p">,</span> <span class="nx">x</span><span class="p">)</span> <span class="o">=></span> <span class="p">[...</span><span class="nx">acc</span><span class="p">,</span> <span class="p">...</span><span class="nx">acc</span><span class="p">.</span><span class="nx">map</span><span class="p">(</span><span class="nx">y</span> <span class="o">=></span> <span class="p">[</span><span class="nx">x</span><span class="p">,</span> <span class="p">...</span><span class="nx">y</span><span class="p">])],</span> <span class="p">[[]]);</span>
<span class="c1">// powerSet = [[], [0], [1], [1,0], [2], [2,0], [2,1], [2,1,0]]</span>
</code></pre></div></div>
<p>The code above is a bit intimidating at first glance so it merits additional
explanation. The power set always contains an empty set, so the second argument
to the reduce method is a set that contains nothing but that. This is the
starting value. When the function acts on the first item in the set, the value
of <code class="language-plaintext highlighter-rouge">acc</code> is <code class="language-plaintext highlighter-rouge">[[]]</code> and the value of <code class="language-plaintext highlighter-rouge">x</code> is <code class="language-plaintext highlighter-rouge">0</code>. The result of concatenating the
current item to each item in <code class="language-plaintext highlighter-rouge">acc</code> is concatenated on to the value of <code class="language-plaintext highlighter-rouge">acc</code>
making it <code class="language-plaintext highlighter-rouge">[[], [0]]</code>. The same algorithm is applied to each item in the set.
This is difficult to envisage, so the code below details essentially what
happens upon invocation.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">ps</span> <span class="o">=</span> <span class="p">(</span><span class="nx">acc</span><span class="p">,</span> <span class="nx">x</span><span class="p">)</span> <span class="o">=></span> <span class="p">[...</span><span class="nx">acc</span><span class="p">,</span> <span class="p">...</span><span class="nx">acc</span><span class="p">.</span><span class="nx">map</span><span class="p">(</span><span class="nx">y</span> <span class="o">=></span> <span class="p">[</span><span class="nx">x</span><span class="p">,</span> <span class="p">...</span><span class="nx">y</span><span class="p">])];</span>
<span class="c1">// First element</span>
<span class="kd">let</span> <span class="nx">acc</span> <span class="o">=</span> <span class="nx">ps</span><span class="p">([[]],</span> <span class="mi">0</span><span class="p">);</span>
<span class="c1">// acc = [[], [0]]</span>
<span class="c1">// Second element</span>
<span class="nx">acc</span> <span class="o">=</span> <span class="nx">ps</span><span class="p">(</span><span class="nx">acc</span><span class="p">,</span> <span class="mi">1</span><span class="p">);</span>
<span class="c1">// acc = [[], [0], [1], [1,0]]</span>
<span class="c1">// Third element</span>
<span class="nx">acc</span> <span class="o">=</span> <span class="nx">ps</span><span class="p">(</span><span class="nx">acc</span><span class="p">,</span> <span class="mi">2</span><span class="p">);</span>
<span class="c1">// acc = [[], [0], [1], [1, 0], [2], [2, 0], [2, 1], [2, 1, 0]]</span>
</code></pre></div></div>
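<p>To reuse the algorithm, the reducer can be wrapped in a function. The sketch below does so under the hypothetical name <code class="language-plaintext highlighter-rouge">powerSet</code> and confirms the cardinality rule stated earlier for a four-element set.</p>

```javascript
// Hypothetical wrapper around the reducer shown above.
const powerSet = set =>
  set.reduce((acc, x) => [...acc, ...acc.map(y => [x, ...y])], [[]]);

const letters = ['a', 'b', 'c', 'd'];
const subsets = powerSet(letters);

// The cardinality of the power set is 2 raised to the cardinality of the set.
const expected = Math.pow(2, letters.length);
// subsets.length = 16 and expected = 16
```

<p>Because the result doubles with every added element, this approach is only practical for small sets.</p>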
<h2 id="conclusion">Conclusion</h2>
<p>This post outlined a few useful set operations. ES6 uses the reduce method to
apply the concept of summation to sets. A power set is a set of all possible set
combinations. Although there is no built-in ES6 functionality for this, it’s an
easy algorithm to create. Make sure to come back for the final post entitled
When Sets Collide. It is by far the most useful in the series covering set
operations that act on multiple individual sets.</p>Dale Alleshousedale@alleshouse.netWelcome to the second installment of this three-part series on set theory. The first piece, Set Theory Defined (recently updated with code samples), detailed requisite foundational knowledge. It is highly recommended that readers begin there if they haven’t already. The first piece in this series introduced sets and exhibited how ES6 arrays are analogous to them. It also depicted how to transform, or map, a set into a related set. This post expands on set theory by probing into set operations.Just Enough Set Theory - Set Theory Defined (Part 1 of 3)2017-02-05T00:00:00+00:002017-02-05T00:00:00+00:00https://www.hideoushumpbackfreak.com/2017/02/05/Set-Theory-Defined<p>Set theory is incredibly intuitive and has many practical applications in
software engineering. In fact, any professional programmer without an
understanding of it is at a disadvantage. Unfortunately, many in the industry relegate
it to the purview of mathematicians. This is understandable because most
material on the subject delineates set theory with first order logic as a basis
for math. The good news is that it doesn’t have to be this way. As this series
demonstrates, it is accessible to anyone regardless of background.</p>
<p>The three articles in this series aim to introduce set theory, expound upon set
operations, and demonstrate the learning using JavaScript (ES6). The goal is to
provide the reader with actionable knowledge to improve his/her software skills
without a surfeit of superfluous details. This first installment describes the
theory in order to provide a firm foundation for future practical application.</p>
<!--more-->
<p><strong>NOTE</strong>: All code samples are written in ES6 and are therefore not likely to
execute directly in a browser. The best option is to use Node or transpile the
excerpts using either <a href="https://babeljs.io/">Babel</a> or
<a href="https://www.typescriptlang.org/">TypeScript</a>. The working code is available on
<a href="https://github.com/dalealleshouse/settheory">GitHub</a> along with execution
instructions.</p>
<h2 id="what-is-set-theory">What is Set Theory</h2>
<p>The inception of set theory dates back to the nineteenth century with Georg
Cantor. On the surface, it’s brilliantly simple. A set is simply a collection of
unordered objects. In mathematical parlance, objects contained in a set are
known as members or elements. An element can be literally anything, including
another set. Sets are typically depicted as objects inside curly braces and are
denoted by capital letters. For instance, <script type="math/tex">A=\{1,2,3\}</script> is the mathematical
representation of the set <script type="math/tex">A</script> with the members <script type="math/tex">1</script>, <script type="math/tex">2</script>, and <script type="math/tex">3</script>. Set
membership is signified as: <script type="math/tex">1 \in A</script>. Figure One – Sets illustrates these
symbols.</p>
<p><img src="/assets/images/set-theory-1/figure1.png" alt="Figure 1" class="img-center" /></p>
<p>Set theory relies on FOPL (First Order Predicate Logic) to construct sets.
Expanding on the definition above, sets are a collection of objects that satisfy
a predicate. A predicate is a function that accepts a single argument and
returns a Boolean (true or false) value. For instance, the set of all dogs has
the predicate <script type="math/tex">IsDog(n)</script>. In other words, elements of a set share some
arbitrary property. FOPL is fascinating, but not particularly relevant to this
article. A general acumen of predicates is sufficient for comprehension of this
material. A cursory web search for First Order Logic will present sufficient
resources for the curious reader.</p>
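<p>In code, a predicate is simply a function that returns true or false, and a set is the collection of objects for which it returns true. The following is a minimal sketch; the <code class="language-plaintext highlighter-rouge">not</code> helper and the sample data are assumptions made for illustration.</p>

```javascript
// A predicate: accepts a single argument and returns a Boolean.
const isDog = n => n.type === "Dog";

// Hypothetical helper: builds the negation of any predicate.
const not = predicate => x => !predicate(x);

const animals = [
  {name: "Tom", type: "Cat"},
  {name: "Pluto", type: "Dog"},
  {name: "Scooby Doo", type: "Dog"}];

// The set of all dogs is every element that satisfies isDog.
const dogs = animals.filter(isDog);
// dogs contains Pluto and Scooby Doo
const nonDogs = animals.filter(not(isDog));
// nonDogs contains Tom
```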
<h2 id="set-mapping">Set Mapping</h2>
<p>There are a few interesting operations that can be performed on sets, most of
which are covered in the next installment. However, mapping from one set to
another is germane to a foundational understanding of set theory. A set is
transformed, or mapped, into another related set via the use of a function.</p>
<p>A mathematical function is analogous to a software function with added
constraints. They are similar in that they accept an input and return an output.
The difference is that a mathematical function accepts only a single input,
must return an output, is deterministic, and cannot have side effects.
Sources often refer to functions as relations between sets because they map a
member of one set to a member of another set. While mathematical functions are
relevant to the understanding of set theory, programmers need not be
particularly concerned with this concept. The significant notion is that of a
function in general, which should be apparent to most software professionals. As
an aside, further understanding of mathematical functions is particularly useful
for other programming concepts.</p>
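<p>The distinction can be sketched in a few lines. Both function names below are hypothetical; the first behaves like a mathematical function, while the second does not because its result depends on, and changes, state outside the function.</p>

```javascript
// Behaves like a mathematical function: single input, always returns
// the same output for the same input, no side effects.
const double = n => n * 2;

// Not a mathematical function: it reads and mutates external state,
// so the same input yields different outputs.
let counter = 0;
const addAndCount = n => {
  counter += 1;
  return n + counter;
};

const first = addAndCount(1);
// first = 2
const second = addAndCount(1);
// second = 3
```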
<p>Mapping works by applying a function to each member of a set and placing the
output into another set. Figure Two – Set Mapping illustrates the concept. This
is particularly applicable to programming, so understanding is imperative.</p>
<p><img src="/assets/images/set-theory-1/figure2.png" alt="Figure 2" class="img-center" /></p>
<p>Given the information above, the impetus of the map method of arrays in
JavaScript (ES6) is obvious. Arrays are a convenient analog to sets. See the
code sample below.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">wholeNumbers</span> <span class="o">=</span> <span class="p">[</span><span class="mi">1</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="mi">3</span><span class="p">];</span>
<span class="kd">const</span> <span class="nx">evenNumbers</span> <span class="o">=</span> <span class="nx">wholeNumbers</span><span class="p">.</span><span class="nx">map</span><span class="p">(</span><span class="nx">n</span> <span class="o">=></span> <span class="nx">n</span> <span class="o">*</span> <span class="mi">2</span><span class="p">);</span>
<span class="c1">// evenNumbers = [2, 4, 6]</span>
</code></pre></div></div>
<p>The above isn’t exactly a realistic scenario: generating an array of doubled
numbers is rarely useful. A more realistic use of the map method is to
transform complex objects. See the code below.</p>
<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">people</span> <span class="o">=</span> <span class="p">[{</span><span class="na">id</span><span class="p">:</span> <span class="mi">1</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Ada Lovelace</span><span class="dl">"</span><span class="p">},</span> <span class="p">{</span><span class="na">id</span><span class="p">:</span><span class="mi">2</span><span class="p">,</span> <span class="na">name</span><span class="p">:</span> <span class="dl">"</span><span class="s2">Charles Babbage</span><span class="dl">"</span><span class="p">}];</span>
<span class="kd">const</span> <span class="nx">names</span> <span class="o">=</span> <span class="nx">people</span><span class="p">.</span><span class="nx">map</span><span class="p">(</span><span class="nx">p</span> <span class="o">=></span> <span class="nx">p</span><span class="p">.</span><span class="nx">name</span><span class="p">);</span>
<span class="c1">// names = ["Ada Lovelace", "Charles Babbage"]</span>
</code></pre></div></div>
<p>Map is exceedingly suitable for many use cases. Understanding set theory
elucidates its utility.</p>
<h2 id="warning">Warning</h2>
<p>As a fair warning, the remainder of this post provides a prospectus of the areas
of set theory that aren’t directly applicable to everyday programming
activities. Although intriguing, the uninterested reader should feel free to
skip to the conclusion.</p>
<h2 id="to-infinity-and-beyond">To Infinity and Beyond</h2>
<p>The conception of sets isn’t exactly revolutionary. Kindergarten pedagogy
teaches children to categorize objects into sets. It’s simple and intuitive. The
innovation is revealed by examining sets of infinite size.</p>
<p>Conceptually, there are two methods for comparing the sizes of sets. The first
is to enumerate the members and compare the resulting counts. This is blindingly
obvious; however, it has a substantial flaw. It isn’t possible to calculate the
number of members in an infinite set. As a second option, Cantor postulated that
if it is possible to create a function that pairs every member of the first set
with exactly one member of the second set, without skipping or reusing members,
then the sets must be of equal size.</p>
<p>The canonical example is to compare the set of natural numbers (whole numbers
excluding zero) to the set of even natural numbers. Figure Three – Counting Sets
demonstrates the concept. Although it’s not exactly intuitive, and is often
controversial, this establishes that the two infinite sets are equally sized.
This might lead one to believe infinity is simply infinity. However, it’s a bit
more abstruse.</p>
<p><img src="/assets/images/set-theory-1/figure3.png" alt="Figure 3" class="img-center" /></p>
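<p>As a rough illustration of the pairing in figure three, the sketch below applies it to a finite prefix of the natural numbers. The names <code>toEven</code> and <code>toNatural</code> are hypothetical helpers that do not appear in the original article; the point is only that each function undoes the other, so no member of either set is skipped or used twice.</p>

```javascript
// Pair every natural number n with the even natural number 2n.
const toEven = n => n * 2;
// The inverse pairing: every even natural number e maps back to e / 2.
const toNatural = e => e / 2;

// A finite prefix stands in for the infinite set of natural numbers.
const naturals = [1, 2, 3, 4, 5];
const evens = naturals.map(toEven);
// evens = [2, 4, 6, 8, 10]

// Mapping back recovers the original members, so nothing was skipped.
const roundTrip = evens.map(toNatural);
// roundTrip = [1, 2, 3, 4, 5]
```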
<p>Consider the set of real numbers (both rational and irrational) between one
and two. Think back to the number lines that are an
inexorable part of preparatory education and envision a set encompassing all
numbers on the line between one and two. Regardless of the placement of two
distinct points on the line, it is possible to find another number between
them. The interesting thing about this infinite set is that it is not possible
to create a function that maps the set of natural numbers to this set without
skipping members. This implies that although both sets are infinite, the set of
real numbers between one and two is actually larger than the set of all natural
numbers. Cantor verified this in a beautifully elegant proof known as Cantor’s
diagonal argument.</p>
<p>While theoretically straightforward, the notion of multiple sizes of infinity is
a bit vexatious. John von Neumann once said, “in mathematics you don’t
understand things. You just get used to them.” This concept bears out his
remark. The good news is that the notion of different sizes of infinity is
only applicable in the most esoteric areas of computer science. The majority of
programmers need not concern themselves with it.</p>
<h2 id="dont-be-naïve">Don’t be Naïve</h2>
<p>Set theory took the mathematical world by storm with its simplicity and
elegance. Many foundational theories are built on the cornerstone of set theory.
However, it contains a substantial flaw which could have spelled doom except
that mathematicians couldn’t deny its utility. Therefore, it split into two
separate theories known as naïve and axiomatic set theory. It’s similar to how
general and special relativity exist simultaneously.</p>
<p>Naïve set theory is sufficient for many applications. In fact, it is adequate
for almost all software engineering use cases. Axiomatic set theory does apply
to some esoteric areas of computability and logic. However, it is far removed
from the vast majority of programming tasks.</p>
<p>As for axiomatic set theory, it is an extension of the original theory that
introduces several axioms that address flaws. The underlying issue with naïve
set theory is that a paradox can arise when defining predicates. The most
popular demonstration of the defect is Russell’s Paradox. Succinctly stated:
does the set of all sets that do not include themselves include itself? If the
answer is yes, then the definition is contradictory because the set contains a
member, itself, that does include itself. If the answer is no, then the set does
not include itself and therefore, by its own definition, must be a member of
itself. Don’t worry if this seems perplexing; it often requires reflection.</p>
<p>The finer points of axiomatic set theory are beyond the scope of this article.
However, the intrigued reader should perform a web search for Zermelo–Fraenkel
set theory to learn more. Regardless of its applicability to programming, it’s
quite captivating.</p>
<h2 id="conclusion">Conclusion</h2>
<p>The most pertinent programming related concepts detailed in this post are sets
and set mapping. A set is simply a collection of objects. Set mapping is
applying a function to each member of a set to produce a related set. The
following pieces in this series expound on how these concepts are applicable.</p>
<p>Set theory is surprisingly simple yet it reveals some mystifying truths such as
the fact that there are multiple sizes of infinity. There are essentially two
branches of set theory: naïve and axiomatic. Naïve set theory is sufficient for
the majority of software engineering applications.</p>
<p>Make sure to come back for the next article. With the foundational concepts out
of the way, the post delves into set operations which provide valuable mental
models for programmers. These are concepts that will improve your development
abilities.</p>
Dale Alleshouse dale@alleshouse.net
Coding Theory (Part 3 of 3) - Demonstration
2016-11-11T00:00:00+00:00
https://www.hideoushumpbackfreak.com/2016/11/11/Coding-Theory-Demonstration
<p>Welcome to the final installment of this three-part series on coding theory. If
you have not had the opportunity to read the first two pieces, it is highly
recommended that you do before continuing on. They are available here:</p>
<ul>
<li><a href="/2016/06/30/Coding-Theory-Defined.html">Coding Theory (Part 1 of 3) - Coding Theory Defined</a></li>
<li><a href="/2016/10/29/Perfect-Error-Correction.html">Coding Theory (Part 2 of 3) - Perfect Error Correction</a></li>
</ul>
<p>Having covered the supporting concepts in previous posts, this article aims to dive into
a demonstration which consists of defining a code using a generator matrix and
correcting errors using a parity check matrix. The example is a bit contrived
and thoroughly simplified for the sake of brevity. However, the intent is not to
provide an exhaustive resource; it is to familiarize the reader with coding
theory and hopefully entice him/her into further inquiry.</p>
<!--more-->
<p>As a fair warning, this post contains a modest amount of high school/first year
college level math. An understanding of Boolean algebra (integer arithmetic
modulo two) and matrices is a welcome asset to readers. However, learners less
accustomed to these concepts can still follow along and simply have faith that
the math works out as advertised. A cursory overview of relevant math concepts
is provided where appropriate.</p>
<h2 id="generator-matrix">Generator Matrix</h2>
<p>A generator matrix is a simple, yet particularly clever means of generating
codes. It is composed of an identity matrix combined with an arbitrary
matrix. Multiplying a message in row matrix form by a generator matrix produces
a codeword. This is a difficult concept to grasp without an example. Therefore,
the remainder of this section provides step-by-step instructions for creating a
generator matrix that will produce a code with eight codewords.</p>
<p>The first step is to define an identity matrix which is a matrix that any given
matrix can be multiplied by without changing the value of the given matrix. This
is accomplished by setting the principal diagonal elements to one and leaving
the rest as zero. See figure one for an example. The matrix is of order three
because a three-digit binary string can represent eight possible values which is
the number of desired codewords.</p>
<p><img src="/assets/images/coding-theory-3/figure1.png" alt="Figure 1" class="img-center" /></p>
<p>The next step is to define an arbitrary matrix (denoted by <script type="math/tex">A</script>). The size of
the matrix determines the size of generated codewords. If <script type="math/tex">m</script> is the size of
the identity matrix, and <script type="math/tex">n</script> is the desired length of codewords, then the
arbitrary matrix should be of size <script type="math/tex">m−by−(n−m)</script>. Six digit codewords suffice
for the purposes of this article; therefore, the arbitrary matrix must be sized
three by three (six-digit length minus three-digit identity). Figure two is
<script type="math/tex">A</script> as used by the remaining examples.</p>
<p><img src="/assets/images/coding-theory-3/figure2.png" alt="Figure 2" class="img-center" /></p>
<p>The only thing left to do is combine the two matrices above to form <script type="math/tex">G</script>.
It’s as simple as placing them side by side as shown in figure three.</p>
<p><img src="/assets/images/coding-theory-3/figure3.png" alt="Figure 3" class="img-center" /></p>
<p>With the generator matrix (<script type="math/tex">G</script>) in hand, generating codewords is trivial.
Multiplying any three-digit binary message in row matrix form by <script type="math/tex">G</script> produces a
codeword. For example, the message <script type="math/tex">011</script> becomes the codeword <script type="math/tex">011110</script> as
shown in figure four. Notice the codeword is the original message with three
parity bits appended. This happens because the generator matrix begins with an
identity matrix.</p>
<p><img src="/assets/images/coding-theory-3/figure4.png" alt="Figure 4" class="img-center" /></p>
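<p>The multiplication can be sketched in a few lines of JavaScript. The matrix <code>A</code> below is an assumed stand-in, chosen only because it reproduces the 011 to 011110 example; the article’s actual matrix appears in figure two and may differ. The helper name <code>encode</code> is likewise hypothetical.</p>

```javascript
// Assumed stand-in for the arbitrary matrix A (the article's A is in figure two).
const A = [
  [1, 1, 0],
  [0, 1, 1],
  [1, 0, 1],
];
// Identity matrix of order three.
const I3 = [
  [1, 0, 0],
  [0, 1, 0],
  [0, 0, 1],
];
// G = [I | A]: place the identity matrix and A side by side.
const G = I3.map((row, i) => [...row, ...A[i]]);

// Multiply a row vector by a matrix using arithmetic modulo two.
const encode = (message, generator) =>
  generator[0].map((_, col) =>
    message.reduce((sum, bit, row) => (sum + bit * generator[row][col]) % 2, 0)
  );

const codeword = encode([0, 1, 1], G);
// codeword = [0, 1, 1, 1, 1, 0]
```

<p>The first three bits of the result repeat the message because the first three columns of <code>G</code> form the identity matrix.</p>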
<h2 id="examining-the-code">Examining the Code</h2>
<p>The example code (<script type="math/tex">C</script>) consists of every number between <script type="math/tex">000</script> and
<script type="math/tex">111</script> multiplied by the generator matrix as shown in figure five. The example
code has a couple of notable attributes. The first is that the sum of any two
codewords is yet another codeword. A code with this property is known as a linear code. Another
notable characteristic is that the minimum Hamming distance of the code is
equal to the minimum weight of the nonzero codewords. Weight is the number of
ones within a codeword. The reasons for this are beyond the scope of this post;
it is mentioned to entice the reader into continued exploration. Examining the
code reveals that the minimum Hamming distance is three (<script type="math/tex">d(C)=3</script>).</p>
<p><img src="/assets/images/coding-theory-3/figure5.png" alt="Figure 5" class="img-center" /></p>
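<p>Both attributes can be checked by brute force. The sketch below enumerates all eight codewords from an assumed generator matrix (built from a stand-in <code>A</code> that is consistent with the 011 to 011110 example, not necessarily the one in figure two) and then tests linearity and compares minimum weight with minimum distance. All helper names are hypothetical.</p>

```javascript
// Assumed generator matrix G = [I | A]; A is a stand-in for figure two.
const G = [
  [1, 0, 0, 1, 1, 0],
  [0, 1, 0, 0, 1, 1],
  [0, 0, 1, 1, 0, 1],
];
const encode = message =>
  G[0].map((_, col) =>
    message.reduce((sum, bit, row) => (sum + bit * G[row][col]) % 2, 0)
  );

// Every three-digit message from 000 to 111.
const messages = [0, 1, 2, 3, 4, 5, 6, 7].map(n => [(n >> 2) % 2, (n >> 1) % 2, n % 2]);
const code = messages.map(encode).map(word => word.join(""));

// Weight: number of ones. Distance: number of differing positions.
const weight = word => [...word].filter(bit => bit === "1").length;
const distance = (x, y) => [...x].filter((bit, i) => bit !== y[i]).length;
// Modulo-two sum of two codewords.
const add = (x, y) => [...x].map((bit, i) => (Number(bit) + Number(y[i])) % 2).join("");

// Linear: the sum of any two codewords is also a codeword.
const isLinear = code.every(x => code.every(y => code.includes(add(x, y))));

// Minimum weight over the nonzero codewords.
const minWeight = Math.min(...code.filter(w => weight(w) > 0).map(weight));

// Minimum distance over all pairs of distinct codewords.
const minDistance = Math.min(
  ...code.flatMap(x => code.filter(y => y !== x).map(y => distance(x, y)))
);
// isLinear = true, minWeight = 3, minDistance = 3
```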
<p>With the code in hand, it’s possible to calculate the equations outlined in part
two of this series. First, it’s pertinent to know how many errors the code is
capable of detecting and correcting. The previous paragraph establishes the minimum
Hamming distance as three. Figure six demonstrates that the example code is
capable of detecting a maximum of two errors and correcting a maximum of one.</p>
<p><img src="/assets/images/coding-theory-3/figure6.png" alt="Figure 6" class="img-center" /></p>
<p>Another relevant equation introduced in the second installment of this series is
the Hamming bound. Recall that <script type="math/tex">|C|</script> denotes the upper bound on the number of
codewords, <script type="math/tex">n</script> is the length of the codewords, and <script type="math/tex">k</script> is the maximum number
of errors the code is capable of correcting. Figure seven demonstrates plugging
these variables into the Hamming bound equation.</p>
<p><img src="/assets/images/coding-theory-3/figure7.png" alt="Figure 7" class="img-center" /></p>
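<p>The arithmetic can also be checked in code. The sketch below assumes the standard binary form of the Hamming bound, two to the power <code>n</code> divided by the sum of the binomial coefficients from zero through <code>k</code>; the function names are hypothetical.</p>

```javascript
// Binomial coefficient "n choose r", built up one factor at a time.
const choose = (n, r) =>
  Array.from({ length: r }, (_, i) => i + 1)
    .reduce((acc, i) => (acc * (n - r + i)) / i, 1);

// Hamming bound for binary codes of length n that correct k errors.
const hammingBound = (n, k) => {
  const sphere = Array.from({ length: k + 1 }, (_, i) => choose(n, i))
    .reduce((sum, term) => sum + term, 0);
  return 2 ** n / sphere;
};

const bound = hammingBound(6, 1);
// bound = 64 / 7, roughly 9.14
```

<p>With six-digit codewords and single error correction, the bound allows at most nine codewords, so the eight codewords of the example code fit within it.</p>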
<p>The remainder of this post deals with detecting and correcting errors after
transmission. Parity check matrices, described in the next section, are a
counterpart to generator matrices that facilitate error detection and
correction.</p>
<h2 id="parity-check-matrices">Parity Check Matrices</h2>
<p>Parity check matrices are derived from generator matrices. They are used during
the decoding process to expose and correct errors introduced during
transmission. Multiplying a parity check matrix by the transpose of a codeword
exposes errors. The concept is best elucidated by demonstration.</p>
<p>A parity check matrix (denoted as <script type="math/tex">H</script>) is composed of the transpose of the
arbitrary matrix combined with the identity matrix. As a refresher, the
transpose of a matrix is simply the matrix flipped across its diagonal so that
the <script type="math/tex">(i,j)</script><sup>th</sup> element in the matrix becomes the
<script type="math/tex">(j,i)</script><sup>th</sup> element. Figure eight shows the parity check matrix that
corresponds to the generator matrix from the running example.</p>
<p><img src="/assets/images/coding-theory-3/figure8.png" alt="Figure 8" class="img-center" /></p>
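<p>As a minimal sketch of the construction, the code below transposes an assumed arbitrary matrix and appends the identity matrix. The values in <code>A</code> are a stand-in consistent with the running example, not a copy of figure two, and <code>transpose</code> and <code>identity</code> are hypothetical helpers.</p>

```javascript
// Assumed stand-in for the arbitrary matrix A (the article's A is in figure two).
const A = [
  [1, 1, 0],
  [0, 1, 1],
  [1, 0, 1],
];

// Flip a matrix across its diagonal: element (i, j) becomes element (j, i).
const transpose = matrix => matrix[0].map((_, col) => matrix.map(row => row[col]));

// Identity matrix of the given order.
const identity = size =>
  Array.from({ length: size }, (_, i) =>
    Array.from({ length: size }, (_, j) => (i === j ? 1 : 0))
  );

// H = [A transposed | I]: place the two matrices side by side.
const I3 = identity(3);
const H = transpose(A).map((row, i) => [...row, ...I3[i]]);
// H = [[1, 0, 1, 1, 0, 0], [1, 1, 0, 0, 1, 0], [0, 1, 1, 0, 0, 1]]
```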
<p>Multiplying the transpose of any valid codeword by the parity check matrix
produces a zero-value result as demonstrated in figure nine. The mathematical
rationale for this is beyond the scope of this post. However, it is a
worthwhile endeavor for the reader to research this further.</p>
<p><img src="/assets/images/coding-theory-3/figure9.png" alt="Figure 9" class="img-center" /></p>
<p>Changing any of the bits in the codeword produces a non-zero result which
indicates an error. Consider <script type="math/tex">011010</script>, as shown in figure ten. The result does
not equal zero so at least one of the bits is erroneous.</p>
<p><img src="/assets/images/coding-theory-3/figure10.png" alt="Figure 10" class="img-center" /></p>
<p>After identifying an inaccurate codeword, it may be possible to correct it using
<script type="math/tex">H</script>. Continuing with the example above, the product of the codeword and <script type="math/tex">H</script>
is equal to the fourth column in <script type="math/tex">H</script>. This indicates an error in the fourth bit,
and changing the fourth bit produces the correct codeword. See figure eleven for
an illustration.</p>
<p><img src="/assets/images/coding-theory-3/figure11.png" alt="Figure 11" class="img-center" /></p>
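<p>The detection and correction steps can be sketched as follows. The parity check matrix is built from an assumed stand-in for <code>A</code>, so its exact values may differ from figure eight, although it reproduces the behavior described for 011110 and 011010. The names <code>syndrome</code> and <code>correct</code> are hypothetical.</p>

```javascript
// Assumed parity check matrix H = [A transposed | I].
const H = [
  [1, 0, 1, 1, 0, 0],
  [1, 1, 0, 0, 1, 0],
  [0, 1, 1, 0, 0, 1],
];

// Multiply H by the transpose of the received word, modulo two.
const syndrome = word =>
  H.map(row => row.reduce((sum, h, i) => (sum + h * word[i]) % 2, 0));

// Flip the bit whose column in H matches the syndrome, if there is one.
const correct = word => {
  const s = syndrome(word);
  if (s.every(bit => bit === 0)) return word; // no error detected
  const position = H[0].findIndex((_, col) => H.every((row, r) => row[col] === s[r]));
  if (position === -1) return null; // no column matches: more than one bit is wrong
  return word.map((bit, i) => (i === position ? 1 - bit : bit));
};

const valid = syndrome([0, 1, 1, 1, 1, 0]);
// valid = [0, 0, 0]
const invalid = syndrome([0, 1, 1, 0, 1, 0]);
// invalid = [1, 0, 0], which matches the fourth column of H
const fixed = correct([0, 1, 1, 0, 1, 0]);
// fixed = [0, 1, 1, 1, 1, 0]
```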
<p>Because the example code is only capable of correcting a single error, changing
more than one bit generates an irrecoverable codeword. However, with a more
complex code, it is possible to correct multiple errors using the distinct sums
of <script type="math/tex">H</script> columns and the nearest neighbor method. Again, the reader is encouraged
to expand on this with more research.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This concludes the three-part series on coding theory. Coding theory is a
fascinating field that enables the reliable transfer of information in spite of
the shortcomings inherent in computing machinery. Richard Hamming, a pioneer in
the field, devised ingenious codes that allow a maximum amount of data recovery
using a minimum amount of redundancy. His codes are still widely used and have
many practical applications. This post demonstrated Hamming’s methods by
providing step-by-step instruction for generating codewords using a generator
matrix. Additionally, it illustrated how to derive a parity check matrix from
the generator matrix and use it to correct errors.</p>
<p>Thank you for taking the time to read this series of articles. As always, I’m
happy to answer any questions or elaborate on details in future posts upon request.
I hope this series has inspired the reader to explore further.</p>
Dale Alleshouse dale@alleshouse.net
Coding Theory (Part 2 of 3) - Perfect Error Correction
2016-10-29T00:00:00+00:00
https://www.hideoushumpbackfreak.com/2016/10/29/Perfect-Error-Correction
<p>Welcome to the second installment of this three-part series on coding theory.
If you have not had the opportunity to read the first piece, it is highly
recommended that you do before continuing on. It is available here: <a href="/2016/06/30/Coding-Theory-Defined.html">Coding
Theory (Part 1 of 3) - Coding Theory Defined</a></p>
<p>It’s rare to find concepts simple yet adroit at the same time. However,
Hamming’s contributions to coding theory “fit the bill”. This post begins with
a brief introduction to Hamming and a short history lesson before diving into
Hamming Distance and Perfect Codes. Additionally, it delves into a few simple
math concepts requisite for understanding the final post. These concepts all
come together in the final installment by providing examples of how to generate
and decode the most powerful and efficient error correcting codes in use today.</p>
<!--more-->
<h2 id="richard-hamming">Richard Hamming</h2>
<p>Richard Hamming was an American mathematician who lived from 1915 through 1998.
Early in his career, he programmed IBM calculating machines for the infamous
Manhattan Project. Concerned about the pernicious effect he may be having on
humanity, he abandoned the Manhattan Project to work for Bell Laboratories in 1946.
Hamming’s tenure at Bell Laboratories was illustrious. His contributions
during that time include Hamming codes, Hamming matrix, Hamming window, Hamming
numbers, Hamming bound, and Hamming distance. These discoveries
had a lasting impact on the fields of computer science and
telecommunications. After leaving Bell Laboratories in 1976, Hamming worked in
academia until his death in 1998.</p>
<h2 id="the-inception-of-error-correcting-codes">The Inception of Error Correcting Codes</h2>
<p>The world of computation was very different back in 1947. At that time,
producing modest (by today’s standards) calculations could take days. Just like
today, machines of yore operated on bit strings with parity bits to ensure data
fidelity. However, upon detecting erroneous data, the machines had no choice but
to halt computation and return an error result. Imagine the frustration of being
47 hours into a 48-hour program and having it error out due to an anomaly
introduced by noise. This is the dilemma Richard Hamming faced.</p>
<p>In 1950, Hamming published a paper that would serve as the basis for modern
coding theory. He postulated that it was possible to not only detect, but
correct errors in bit strings by calculating the number of bits disparate
between valid codes and the erroneous code. This came to be known as Hamming
Distance.</p>
<h2 id="hamming-distance">Hamming Distance</h2>
<p>The Hamming distance between two codewords is simply the number of bit
positions in which they differ, as demonstrated in figure one. Typically,
Hamming distance is denoted by the function <script type="math/tex">d(x,y)</script> where <script type="math/tex">x</script> and <script type="math/tex">y</script> are
codewords. This concept seems incredibly mundane on the surface, but it’s the
inception of a whole new paradigm in error correcting codes; specifically,
Nearest Neighbor error correction.</p>
<p><img src="/assets/images/coding-theory-2/figure1.png" alt="Figure 1" class="img-center" /></p>
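<p>A minimal sketch of the distance function follows. It treats codewords as strings of ones and zeros; the name <code>hammingDistance</code> and the sample codewords are illustrative and are not taken from figure one.</p>

```javascript
// Count the positions at which two equal-length bit strings differ.
const hammingDistance = (x, y) => {
  if (x.length !== y.length) {
    throw new Error("codewords must be the same length");
  }
  return [...x].filter((bit, i) => bit !== y[i]).length;
};

const d = hammingDistance("10110", "11011");
// d = 3 (the second, third, and fifth bits differ)
```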
<p>Nearest neighbor error correction involves first defining codewords, typically
denoted as <script type="math/tex">C</script>, that are known to both the source and sink. Any received
codeword not contained in <script type="math/tex">C</script> is obviously the result of noise. Upon
identifying an erroneous codeword, nearest neighbor decoding calculates the
Hamming distance between it and every codeword contained in <script type="math/tex">C</script>. The codeword
with the smallest Hamming distance has a high probability of being correct. See
figure two.</p>
<p><img src="/assets/images/coding-theory-2/figure2.png" alt="Figure 2" class="img-center" /></p>
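<p>Nearest neighbor decoding can be sketched as a search for the closest codeword. The four-word code below is a hypothetical example chosen for illustration, not the code shown in figure two, and <code>nearestNeighbor</code> is an assumed helper name.</p>

```javascript
// Number of positions at which two equal-length bit strings differ.
const hammingDistance = (x, y) => [...x].filter((bit, i) => bit !== y[i]).length;

// Return the codeword in the code that is closest to the received word.
const nearestNeighbor = (received, code) =>
  code.reduce((best, candidate) =>
    hammingDistance(received, best) > hammingDistance(received, candidate)
      ? candidate
      : best
  );

// Hypothetical code known to both the source and the sink.
const code = ["00000", "01011", "10101", "11110"];

const decoded = nearestNeighbor("01111", code);
// decoded = "01011", the only codeword at distance one from the received word
```

<p>When two codewords are equally close, this sketch simply keeps the first one it encounters, so ties are not resolved in any principled way.</p>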
<p>The quality of error correction is heavily dependent on choosing efficient
codewords. <script type="math/tex">d(C)</script> denotes Minimum Hamming Distance: that is, the smallest
Hamming distance between any two distinct codewords contained within <script type="math/tex">C</script>. If a code
has a minimum Hamming distance of one (<script type="math/tex">d(C)=1</script>) then nearest neighbor error
correction is futile. If it has a large minimum Hamming distance, such as 10
(<script type="math/tex">d(C)=10</script>), then error correction is powerful.</p>
<p>Hamming represented the relationship between minimum Hamming distance and the
quality of error correction with two concise equations. A particular code can
detect a maximum of <script type="math/tex">k</script> errors in a codeword if <script type="math/tex">d(C)≥k+1</script> and correct a
maximum of <script type="math/tex">k</script> errors if <script type="math/tex">d(C)≥2k+1</script>. For example, a code with <script type="math/tex">d(C)=10</script>
can detect a maximum of nine errors and correct a maximum of four as
demonstrated in figure three.</p>
<p><img src="/assets/images/coding-theory-2/figure3.png" alt="Figure 3" class="img-center" /></p>
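<p>Solving the two inequalities for the largest whole number of errors gives a compact way to compute both limits. The sketch below assumes that rearrangement; the function names and the four-word sample code are hypothetical.</p>

```javascript
// Number of positions at which two equal-length bit strings differ.
const hammingDistance = (x, y) => [...x].filter((bit, i) => bit !== y[i]).length;

// Smallest distance between any two distinct codewords in the code.
const minimumDistance = code =>
  Math.min(
    ...code.flatMap((x, i) => code.slice(i + 1).map(y => hammingDistance(x, y)))
  );

// Largest k satisfying d(C) >= k + 1.
const maxDetectable = d => d - 1;
// Largest whole k satisfying d(C) >= 2k + 1.
const maxCorrectable = d => Math.floor((d - 1) / 2);

const detectable = maxDetectable(10);
// detectable = 9
const correctable = maxCorrectable(10);
// correctable = 4

// A hypothetical four-word code with minimum distance three.
const sampleDistance = minimumDistance(["00000", "01011", "10101", "11110"]);
// sampleDistance = 3, so it detects two errors and corrects one
```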
<p>An important fact to note is that the equations above represent the maximum
bounds of error detection and correction. It is possible to create a code with a
minimum Hamming distance that falls short of these bounds. In reality, it’s
difficult to create a code that achieves the bounds. There are special codes,
known as Perfect Codes, that meet this criterion as well as demonstrate some
other desirable traits.</p>
<h2 id="perfect-codes">Perfect Codes</h2>
<p>Generating an efficient code is a formidable task because it involves three
competing principles as shown in figure four. First, short codewords reduce the
size of data transmissions. Second, as shown in the previous section, the
greater the minimum Hamming distance, the greater the code’s ability to detect
and correct errors. However, there are a limited number of codewords of a
specified length that also have a specified minimum Hamming distance.</p>
<p><img src="/assets/images/coding-theory-2/figure4.png" alt="Figure 4" class="img-center" /></p>
<p>The Hamming Bound equation demonstrates these competing principles concisely.
The equation is shown in figure five, where <script type="math/tex">|C|</script> is the upper bound on the number of
codewords, <script type="math/tex">n</script> is the length of the codewords, and <script type="math/tex">k</script> is the maximum number
of errors the code is capable of correcting. Any code that achieves the upper bound of
the equation is known as a Perfect Code. As a side note, Richard Hamming
developed a family of perfect codes now known as Hamming Codes.</p>
<p><img src="/assets/images/coding-theory-2/figure5.png" alt="Figure 5" class="img-center" /></p>
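<p>For readers who prefer symbols to a figure, the bound is commonly written for binary codes as shown below. This is the standard textbook form and is assumed, not confirmed, to match figure five; the symbols follow the definitions above. As a worked instance, a Hamming code with codewords of length seven that corrects one error has sixteen codewords, which is exactly the value of the bound, so it is perfect.</p>

```latex
|C| \le \frac{2^{n}}{\sum_{i=0}^{k} \binom{n}{i}}
\qquad \text{for example} \qquad
\frac{2^{7}}{\binom{7}{0} + \binom{7}{1}} = \frac{128}{8} = 16
```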
<h2 id="conclusion">Conclusion</h2>
<p>This concludes the second installment of this three-part series on coding
theory. Richard Hamming created error correcting codes that addressed the
problem of brittle computations in the 1950s. However, they still permeate
modern computer science. The concept of Hamming Distance gave rise to Nearest
Neighbor error correction. The quality of error correction is dependent on the
Hamming Bound, which is an equation that expresses the three competing goals of
an effective code.</p>
<p>Make sure to check back for the final installment of this series. To date, the
posts have covered mostly supporting concepts. However, the concluding piece
brings all of the ideas together into a cohesive whole with an example. As always, thank
you for reading and feel free to contact me with questions or comments.</p>
Dale Alleshouse dale@alleshouse.net
Coding Theory (Part 1 of 3) - Coding Theory Defined
2016-06-30T00:00:00+00:00
https://www.hideoushumpbackfreak.com/2016/06/30/Coding-Theory-Defined
<p>Coding theory stands as a cornerstone for most of computer science. However,
many programmers today have a limited understanding of the field at best.
This three-part series of blog posts describes what coding theory is and delves
into Richard Hamming’s contributions. Although derived in the 1950s, Hamming’s
ideas are so visionary that they still permeate modern coding applications. If
a person truly comprehends Hamming’s work, they can fully appreciate coding
theory and its significance to computer science.</p>
<!--more-->
<p>This first installment of the series defines coding theory, error detecting
codes, and error correcting codes. These are all important supporting concepts
required to fully appreciate future articles. Although this is aimed at the
novice, it will provide a good review for the more seasoned computer scientist.</p>
<h2 id="coding-theory-defined">Coding Theory Defined</h2>
<p>Computer systems store information as a series of bits. Coding theory is the
study of encoding, transmitting, and decoding said information in a reliable
manner. More succinctly: moving bits with fidelity. This appears elementary
from the cursory view. What’s difficult about transferring ones and zeros
across some communications medium? As figure one illustrates, the answer is the
noise introduced by the communications channel.</p>
<p><img src="/assets/images/coding-theory-1/figure1.png" alt="Figure 1" class="img-center" /></p>
<p>Recall that computer systems store data as strings of bits. Each bit has two
possible values. These values are often represented as 1/0, true/false, on/off,
or even high/low. Regardless of the nomenclature used to represent them, they
are nothing more than the absence or presence of a voltage from a computer’s
perspective. Noise, including everything from electrical interference to a
scratched disk surface, can make these values ambiguous to a machine.</p>
<p>As a grossly simplified example, suppose a computer expects either a zero or
five-volt signal. A zero-volt signal goes into one side of the channel and
distortion causes 2.6 volts to come out the other side. Therein lies the
ambiguity. The machine can only interpret the signal as a one and rely on
coding techniques to sort it out.</p>
<p>One important point to remember is that coding theory is requisite due to
shortcomings in modern computer hardware. If contemporary machines could
transmit data reliably, coding theory would be superfluous. It’s not that
building such equipment is impossible. The technology to build reliable
machines exists. It’s just not practical. Such computers would be slow and
exorbitantly expensive. Richard Hamming stated: “We use coding theory to escape
the necessity of doing things right because it’s too expensive to do it right”
(<a href="https://www.youtube.com/watch?v=vNpQL8jo4BI&feature=youtu.be&t=8m29s">Source</a>).</p>
<p>Coding theory addresses the inadequacies of machines by building fault
tolerance directly into transmitted data in the form of error detecting and
error correcting codes.</p>
<h2 id="error-detecting-codes">Error Detecting Codes</h2>
<p>Aptly named, error detecting codes enable receivers to determine if accepted
information contains errors. This is possible by appending calculated
verification data to the data source before transmission. The sender calculates
verification data by running the source data through a deterministic algorithm
which typically produces either a hash, checksum, or parity bit. Upon receipt,
the receiver runs the same algorithm on the information received. If the data
produced by the receiver matches the verification data, it’s safe to assume the
accepted information is unadulterated. Figure two shows the process more
concisely.</p>
<p><img src="/assets/images/coding-theory-1/figure2.png" alt="Figure 2" class="img-center" /></p>
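<p>The send and verify steps can be sketched with a deliberately simple checksum. The algorithm below (sum of character codes modulo 256) is a hypothetical stand-in chosen for brevity, not a scheme the article prescribes, and the names <code>checksum</code>, <code>send</code>, and <code>isIntact</code> are assumptions.</p>

```javascript
// Hypothetical deterministic algorithm: sum of character codes modulo 256.
const checksum = data =>
  [...data].reduce((sum, ch) => (sum + ch.charCodeAt(0)) % 256, 0);

// The sender appends verification data to the source data.
const send = data => ({ data, check: checksum(data) });

// The receiver reruns the same algorithm and compares the results.
const isIntact = packet => checksum(packet.data) === packet.check;

const packet = send("hello");
const ok = isIntact(packet);
// ok = true

// Simulate noise changing one character in transit.
const corrupted = { ...packet, data: "hellp" };
const bad = isIntact(corrupted);
// bad = false
```

<p>A checksum this simple misses many error patterns, such as two characters swapping places, which is why practical systems use stronger algorithms.</p>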
<p>The concept of using codes for error detection is actually quite old. Scribes
in antiquity would sum the number of words in paragraphs and pages and use
those values to detect transcription errors. In that case, the original scroll
is the source and the produced scroll is the sink. The scribe himself is the
communication channel and source of noise. The algorithm used to generate the
verification data is the process of counting the words. Obviously, error
detecting is more complex in modern times, but the general principle remains
unchanged.</p>
<p>Error detection goes beyond simply detecting errors introduced by noise; it can
also detect information tampering by malicious third parties. Although
fascinating, those details are beyond the scope of this post. For
brevity, this article explores the simplest type of error detecting codes:
parity bits. This is the only error coding concept particularly germane to
future installments in this series.</p>
<p>A parity bit (aka check bit) is a verification bit appended to the end of a
codeword. The parity bit equals zero if there are an even number of ones and
one if there are an odd number. Figure 3 illustrates this concept. As a side
note, what is described above is technically an “even” parity bit. The bit
value will be the opposite in the case of an odd parity bit. The remainder of
this article assumes even parity bits.</p>
<p><img src="/assets/images/coding-theory-1/figure3.png" alt="Figure 3" class="img-center" /></p>
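<p>As a minimal sketch, an even parity bit can be computed by summing the bits
modulo two. The function names below are made up for this example, and bits are
modeled as a plain list of integers purely for readability.</p>

```python
def even_parity_bit(bits):
    """Return 0 if the number of ones is even, 1 if it is odd."""
    return sum(bits) % 2


def encode(bits):
    """Append the even parity bit to the end of the data bits."""
    return bits + [even_parity_bit(bits)]


def is_valid(codeword):
    """With even parity, a valid codeword always has an even number of ones."""
    return sum(codeword) % 2 == 0


# Seven data bits containing four ones, so the parity bit is 0.
codeword = encode([1, 0, 1, 1, 0, 0, 1])

# Flip a single bit to simulate a transmission error.
damaged = codeword.copy()
damaged[2] ^= 1

print(codeword, is_valid(codeword), is_valid(damaged))
```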
<p>Parity bits can only detect an odd number of errors. Consider the seven bit
codewords above. The parity bit is only useful if there are one, three, five,
or seven errors. If there are two, four, or six errors, the parity bit indicates
success. One method for mitigating this is by arranging the data into a matrix
and generating parity bits in multiple dimensions as shown in figure four.</p>
<p><img src="/assets/images/coding-theory-1/figure4.png" alt="Figure 4" class="img-center" /></p>
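<p>The following sketch shows one way to lay this out in code. It assumes a
parity bit is appended to every row and a final row holds the parity of every
column; the figure's exact layout may differ, and the function names are
hypothetical.</p>

```python
def parity(bits):
    """Even parity: 0 for an even number of ones, 1 for an odd number."""
    return sum(bits) % 2


def encode_2d(rows):
    """Append a parity bit to each row, then append a row of column parities."""
    with_row_parity = [row + [parity(row)] for row in rows]
    column_parity = [parity(column) for column in zip(*with_row_parity)]
    return with_row_parity + [column_parity]


def failed_checks(block):
    """Return the indices of the rows and columns whose parity check fails."""
    bad_rows = [i for i, row in enumerate(block) if parity(row)]
    bad_cols = [j for j, column in enumerate(zip(*block)) if parity(column)]
    return bad_rows, bad_cols


block = encode_2d([
    [1, 0, 1, 1],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
])

# Two errors in the same row: the row check still passes,
# but the two affected columns fail their checks.
damaged = [row.copy() for row in block]
damaged[1][0] ^= 1
damaged[1][2] ^= 1
bad_rows, bad_cols = failed_checks(damaged)

print(bad_rows, bad_cols)
```

<p>The second dimension catches the double error that a single parity bit
misses. It is still not airtight: four errors at the corners of a rectangle
leave every row and column with even parity.</p>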
<p>The example shown above has two dimensions; however, it’s possible to add
parity bits in any number of dimensions. While it’s fairly easy to imagine a matrix
with three dimensions, it’s arduous to visualize a matrix with more than that.
Regardless, it’s mathematically feasible. A future article in this series
examines this in more detail.</p>
<p>Error detecting codes inform the receiver of errors during the transmission of
information. Knowing there is an error, the receiver can simply request that
the data be resent. Many systems work exactly like this. The next section explores
how coding theory takes this one step further by not only detecting errors, but
correcting them as well.</p>
<h2 id="error-correcting-codes">Error Correcting Codes</h2>
<p>The previous section describes how receivers request a resend upon detecting
errors with error detecting codes. Unfortunately, there are applications where
this isn’t an option. For instance, imagine trying to communicate with a
satellite in deep space when the transmission process could take months.
Another example is data stored on a disk that may degrade over time. It’s
impossible to ask for a retransmission from the source because the source
itself is corrupted. Yet another example is broadcast systems where there is no
backchannel to facilitate resend requests. These are just a few examples. For
such cases there are error correcting codes which not only inform the receiver
of errors, they contain enough information to fix them.</p>
<p>The simplest form of error correcting codes are repetition codes. As the name
implies, the message is simply replicated multiple times. The decoder
determines the correct bits by choosing the majority. Figure five illustrates
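the concept. The amount of duplication is implementation dependent; however,
fewer than three copies is not effective. With only two copies, the decoder can
see that they disagree but has no majority to settle which one is correct.</p>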
<p><img src="/assets/images/coding-theory-1/figure5.png" alt="Figure 5" class="img-center" /></p>
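<p>A repetition code with majority voting can be sketched as follows. The
example assumes the whole message is repeated three times back to back (an
implementation could just as well repeat each bit individually), and the
function names are invented for illustration.</p>

```python
def encode(bits, copies=3):
    """Repeat the whole message `copies` times."""
    return bits * copies


def decode(received, copies=3):
    """Recover each bit by majority vote across the copies."""
    length = len(received) // copies
    blocks = [received[i * length:(i + 1) * length] for i in range(copies)]
    # zip(*blocks) groups together the copies of each bit position.
    return [1 if sum(votes) * 2 > copies else 0 for votes in zip(*blocks)]


message = [1, 0, 1, 1]
sent = encode(message)

received = sent.copy()
received[1] ^= 1  # corrupt one bit in the first copy
received[6] ^= 1  # corrupt a different bit position in the second copy
recovered = decode(received)

print(recovered == message)
```

<p>The decoder recovers the message because each bit position is corrupted in
at most one of its three copies. Two errors in the same position would outvote
the single correct copy.</p>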
<p>There are more elegant and efficient error correction paradigms than repetition
codes. However, repetition codes are still in use in some modern systems due to
their ease of implementation. The main takeaway from this section is simply what error
correcting codes are. Future installments examine them in greater detail.</p>
<h2 id="conclusion">Conclusion</h2>
<p>This concludes the first installment of this three-part series on coding
theory. This article introduced coding theory, error detecting codes, and error
correcting codes. In short, it covered the concepts required to fully appreciate
future posts. Future installments dig into details of coding theory and explore the
works of Richard Hamming, who revolutionized the field in the 1950s.</p>
<p>Make sure to come back for the next article because that’s when things start to
get exciting. The post digs into some fascinating math and the more ingenious
methods used for error correction. As always, thank you for reading and feel
free to contact me with questions or comments.</p>Dale Alleshousedale@alleshouse.netCoding theory stands as a cornerstone for most of computer science. However, many programmers today have a diminutive understanding of the field at best. This three-part series of blog posts describes what coding theory is and delves into Richard Hamming’s contributions. Although derived in the 1950s, Hamming’s ideas are so visionary that they still permeate modern coding applications. If a person truly comprehends Hamming’s work, they can fully appreciate coding theory and its significance to computer science.