Introduction to Distributed Programming
Time for a quick intro to what we mean by distributed programming.
Definitions
- Distributed Computing: computing on a distributed system.
- Distributed System: a system of computers communicating via messages over
a network so as to cooperate on a task or tasks. There's
no physical shared memory in a distributed system, though
algorithms can simulate such a thing.
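To make "communicating via messages" concrete, here's a minimal sketch in Python. It is only a simulation: the two queues stand in for network links, and a thread stands in for a remote computer. The names worker and run_demo are made up for this example. The point is that the caller never touches the worker's state directly; it can only send a request and wait for a reply.

```python
import queue
import threading


def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """A simulated node: it owns its state and only reacts to messages."""
    total = 0  # private to this node; others must ask for it by message
    while True:
        msg = inbox.get()
        if msg == "stop":
            break
        kind, value = msg
        if kind == "add":
            total += value
        elif kind == "get":
            outbox.put(total)  # reply travels back as a message


def run_demo() -> int:
    to_worker = queue.Queue()    # stands in for the link to the node
    from_worker = queue.Queue()  # stands in for the link back
    node = threading.Thread(target=worker, args=(to_worker, from_worker))
    node.start()
    for n in (1, 2, 3):
        to_worker.put(("add", n))
    to_worker.put(("get", None))
    result = from_worker.get(timeout=5)  # wait for the reply message
    to_worker.put("stop")
    node.join()
    return result
```

On a real network the queues would be sockets or some messaging layer, and messages could be delayed, reordered, or lost, which is where most of the interesting problems come from.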
Areas of Study
Here are five (of many, I'd guess):
- Networks and Internets
- Distributed Algorithms
- Paradigms
- Enterprise Computing
- Grid Computing
Networks and Internets
See this page.
Distributed Algorithms
Distributed algorithms are designed for programming distributed systems.
They differ from centralized algorithms because they are unaware of
any global state or a global time frame.
Issues:
- Modeling: transition systems, statecharts, temporal logic
- Communication, Timing, and Synchronization
- Routing Algorithms
- Virtual Circuits and Packet Switching
- Kinds of algorithms: wave algorithms, traversal algorithms,
election algorithms, snapshot algorithms
- Distributed Termination Detection
- Distributed Deadlock Detection
- Distributed Failure Detection
- Stabilization
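One way to see how algorithms cope without a global time frame is a logical clock. The sketch below uses Lamport's logical clock rule, which these notes don't cover themselves, so treat it as an outside illustration rather than part of the list above. The class and method names are my own. Each node keeps a counter, stamps outgoing messages with it, and on receipt jumps ahead of whatever stamp it sees.

```python
class LamportClock:
    """A per-node counter that orders events without any shared wall clock."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """A local event advances the counter by one."""
        self.time += 1
        return self.time

    def send(self) -> int:
        """Sending counts as an event; the stamp travels with the message."""
        return self.tick()

    def receive(self, msg_time: int) -> int:
        """Move past both the local time and the sender's stamp."""
        self.time = max(self.time, msg_time) + 1
        return self.time


def demo():
    a, b = LamportClock(), LamportClock()
    a.tick()                      # a is now 1
    a.tick()                      # a is now 2
    stamp = a.send()              # a is now 3; the message carries 3
    b.tick()                      # b is now 1
    received = b.receive(stamp)   # b becomes max(1, 3) + 1 = 4
    return stamp, received
```

The receive event always gets a larger stamp than the matching send, so cause comes before effect in the numbering even though the two nodes never compared real clocks.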
Distributed Computing Paradigms
- Client-server
- Multi-tier
- Peer-to-peer
- Publish/subscribe
- RPC
- Distributed Objects
- Object Spaces
- Mobile Agents
- Network Services
- Groupware
Exercise: Research these paradigms. Write a survey paper covering all of these,
and any more you find. Provide examples, comparisons, and lots
of references. Make the paper of publishable quality.
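As a starting point for one of the paradigms in the list, here's a rough sketch of publish/subscribe. It runs inside a single process, so it only shows the shape of the idea: publishers and subscribers know about topics and a broker, never about each other. The Broker class and its methods are hypothetical names for this example, not any real product's API.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class Broker:
    """Routes each published message to the handlers subscribed to its topic."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = (
            defaultdict(list)
        )

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Deliver the message and return how many handlers received it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(message)
        return len(handlers)
```

A subscriber can be as simple as a list's append method: after broker.subscribe("orders", inbox.append), anything published on "orders" lands in inbox, and messages on other topics are ignored. In a distributed setting the broker would be a separate network service.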
Enterprise Computing
Enterprise applications are applications that run on large servers with
multiple (simultaneous) users communicating over a network via clients like
web browsers, PDAs, cell phones, or desktop applications. These applications
generally read from and write to big databases.
Some people say enterprise applications are only for business
functions (accounting, customer management, product tracking, etc.);
some say any big distributed application counts as "enterprise".
Enterprise Computing Platforms
There's legacy code out there: COBOL, IMS, CICS.
But most current work is done on two platforms:
- Java EE (also known as: Java Platform, Enterprise Edition)
- .NET (pronounced "dot net")
They didn't start off terribly different, and they're probably
evolving toward each other. (Just like Java and C# are.)
Java EE | .NET |
Runs on a JVM | Runs on the CLR (Common Language Runtime) |
From Sun | From Microsoft |
Fully implemented on many operating systems | Fully implemented on Windows; partially implemented on other operating systems |
Maintained and enhanced by the Java Community Process (comprised of hundreds of companies and organizations) | Microsoft-maintained and enhanced |
Source code for the entire framework freely available | Some source code is proprietary |
Mature | Mature |
Kind of a standard | Kind of a marketing strategy; however, some "components" are official standards (e.g. C#) |
Enterprise Architectures
In the old days, and today for the most trivial of applications,
we see two-tier client-server organizations.
Two-tier architectures are almost always way too fragile.
They soon gave way to three-tier architectures, which put a middle
layer between the front end and the database.
The idea here is that any one of the three layers can be completely
re-implemented without affecting the others.
The middle layer completely isolates the front end from
any knowledge of the database. The UI doesn't even know
what the data source is. It just makes calls like
fetchCustomerById(24337).
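Here's a small sketch of that isolation, in Python (so the call is spelled fetch_customer_by_id). Everything in it is hypothetical: the CustomerStore interface, the InMemoryStore stand-in for a database, and the sample customer record are invented for illustration. The front-end function only ever talks to the middle tier, so the data tier could be swapped for a real database without touching it.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class CustomerStore(ABC):
    """Data tier contract (hypothetical): how the middle tier reaches storage."""

    @abstractmethod
    def load(self, customer_id: int) -> Optional[Dict[str, str]]:
        ...


class InMemoryStore(CustomerStore):
    """Stand-in for a database; a SQL-backed store could replace it."""

    def __init__(self, rows: Dict[int, Dict[str, str]]) -> None:
        self._rows = dict(rows)

    def load(self, customer_id: int) -> Optional[Dict[str, str]]:
        return self._rows.get(customer_id)


class CustomerService:
    """Middle tier: the only layer that knows a store exists."""

    def __init__(self, store: CustomerStore) -> None:
        self._store = store

    def fetch_customer_by_id(self, customer_id: int) -> Dict[str, str]:
        row = self._store.load(customer_id)
        if row is None:
            raise KeyError(f"no customer with id {customer_id}")
        return row


def render_customer(service: CustomerService, customer_id: int) -> str:
    """Front end: formats what the middle tier returns, nothing more."""
    customer = service.fetch_customer_by_id(customer_id)
    return f"Customer {customer_id}: {customer['name']}"
```

With service = CustomerService(InMemoryStore({24337: {"name": "Ada"}})), calling render_customer(service, 24337) gives "Customer 24337: Ada". Replacing InMemoryStore with another CustomerStore changes nothing in render_customer.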
Software running in the middle tier is called middleware.
Middleware products are also called containers, since
they host and manage the business objects. They can manage lifecycles,
transactions, memory, authentication, concurrency, distribution,
security, sessions, resource pooling, logging and lots of other "system-level
plumbing things" so developers only have to concentrate
on business logic.
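To get a feel for what "the container handles the plumbing" means, here's a toy sketch. It is not how any real container works; the managed decorator and the begin/commit/rollback log entries are invented to show the idea that transaction bookkeeping wraps the business method from outside, so the business method itself stays free of it.

```python
import functools
from typing import Callable, List


def managed(log: List[str]) -> Callable:
    """Toy 'container' wrapper: adds transaction-style bookkeeping around a call."""

    def decorate(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            log.append(f"begin {func.__name__}")
            try:
                result = func(*args, **kwargs)
            except Exception:
                log.append(f"rollback {func.__name__}")
                raise
            log.append(f"commit {func.__name__}")
            return result

        return wrapper

    return decorate


audit_log: List[str] = []


@managed(audit_log)
def transfer(amount: int) -> int:
    """Business logic only: no logging or transaction code in here."""
    if amount <= 0:
        raise ValueError("amount must be positive")
    return amount
```

A successful transfer(5) leaves "begin transfer" and "commit transfer" in audit_log; a failing call records a rollback instead and re-raises the error.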
There's no need to stop at three tiers. You'll often hear the
term n-tier.
Sometimes applications are classified by the complexity of the
client:
Thick Client
- Customized client application
- Probably a rich GUI
- Runs on a desktop (but could be delivered via WebStart)
- In two-tier architecture, has too much business logic
- In two-tier architecture, may have embedded database calls
Thin Client
- Client probably just a web browser
- Can make use of a web container's database pooling and other
helpful offerings
- Probably a weak GUI, but new technologies (e.g. Ajax) are helping a lot!
- In two-tier architecture, might have database calls
embedded in a web page
Grid Computing
The term grid computing refers to running highly
compute-intensive jobs (protein folding, SETI, earthquake
simulation, climate modeling) over many computers across
administrative domains. Most of the computers run similar
code; they're all contributing bits toward the overall solution.
More at Wikipedia's
article on Grid Computing.
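The "similar code, each contributing a bit" idea can be sketched on one machine. In the example below a thread pool stands in for the many computers of a grid, and counting primes stands in for the real workload; both are assumptions made for illustration, as are the function names. Every worker runs the same function on its own slice of the range, and the partial answers are summed at the end.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def count_primes(lo: int, hi: int) -> int:
    """Count primes in the half-open range [lo, hi) by trial division."""

    def is_prime(n: int) -> bool:
        if n < 2:
            return False
        i = 2
        while i * i <= n:
            if n % i == 0:
                return False
            i += 1
        return True

    return sum(1 for n in range(lo, hi) if is_prime(n))


def split(lo: int, hi: int, parts: int) -> List[Tuple[int, int]]:
    """Cut [lo, hi) into at most `parts` contiguous chunks."""
    step = max(1, (hi - lo + parts - 1) // parts)
    return [(start, min(start + step, hi)) for start in range(lo, hi, step)]


def grid_count_primes(lo: int, hi: int, workers: int = 4) -> int:
    """Hand one chunk to each simulated worker, then combine the partial counts."""
    chunks = split(lo, hi, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda chunk: count_primes(*chunk), chunks))
    return sum(partials)
```

A real grid adds everything this sketch leaves out: shipping work across administrative domains, machines that vanish mid-job, and checking results from workers you don't control.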
Exercise: Research and write about the differences between grid computing
and cluster-based computing. Mention how a cluster is different from a grid.