Total Pageviews

Saturday, 17 February 2018

Dimensions of Complexity of AI


Dimensions of Complexity
Agents acting in environments range in complexity from thermostats to companies with multiple goals acting in competitive environments. A number of dimensions of complexity exist in the design of intelligent agents. These dimensions may be be considered separately but must be combined to build an intelligent agent. These dimensions define a design space of AI; different points in this space can be obtained by varying the values of the dimensions.
Here we present nine dimensions: modularity, representation scheme, planning horizon, sensing uncertainty, effect uncertainty, preference, number of agents, learning, and computational limits. These dimensions give a coarse division of the design space of intelligent agents. There are many other design choices that must be made to build an intelligent agent.
·         1 Modularity
·         2 Representation Scheme
·         3 Planning Horizon
·         4 Uncertainty
o    4.1 Sensing Uncertainty
o    4.2 Effect Uncertainty
·         5 Preference
·         6 Number of Agents
·         7 Learning
·         8 Computational Limits
·         9 Interaction of the Dimensions

8 Computational Limits

Sometimes an agent can decide on its best action quickly enough for it to act. Often there are computational resource limits that prevent an agent from carrying out the best action. That is, the agent may not be able to find the best action quickly enough within its memory limitations to act while that action is still the best thing to do. For example, it may not be much use to take 10 minutes to derive what was the best thing to do 10 minutes ago, when the agent has to act now. Often, instead, an agent must trade off how long it takes to get a solution with how good the solution is; it may be better to find a reasonable solution quickly than to find a better solution later because the world will have changed during the computation.
The computational limits dimension determines whether an agent has
  • perfect rationality, where an agent reasons about the best action without taking into account its limited computational resources; or
  • bounded rationality, where an agent decides on the best action that it can find given its computational limitations.
Computational resource limits include computation time, memory, and numerical accuracy caused by computers not representing real numbers exactly. An anytime algorithm is an algorithm whose solution quality improves with time. In particular, it is one that can produce its current best solution at any time, but given more time it could produce even better solutions. We can ensure that the quality doesn't decrease by allowing the agent to store the best solution found so far and return that when asked for a solution. However, waiting to act has a cost; it may be better for an agent to act before it has found what would have been the best solution.

figures/ch01/anytimeplot.gif
Figure 1.5: Solution quality as a function of time for an anytime algorithm. The agent has to choose an action. As time progresses, the agent can determine better actions. The value to the agent of the best action found so far, if it had been carried out initially, is given by the dashed line. The reduction in value to the agent by waiting to act is given by the dotted line. The net value to the agent, as a function of the time it acts, is given by the solid line.

Example 1.9: Figure 1.5 shows how the computation time of an anytime algorithm can affect the solution quality. The agent has to carry out an action but can do some computation to decide what to do. The absolute solution quality, had the action been carried out at time zero, shown as the dashed line at the top, is improving as the agent takes time to reason. However, there is a penalty associated with taking time to act. In this figure, the penalty, shown as the dotted line at the bottom, is proportional to the time taken before the agent acts. These two values can be added to get the discounted quality, the time-dependent value of computation; this is the solid line in the middle of the graph. For the example of Figure 1.5, an agent should compute for about 2.5 time units, and then act, at which point the discounted quality achieves its maximum value. If the computation lasts for longer than 4.3 time units, the resulting discounted solution quality is worse than if the algorithm just outputs the initial guess it can produce with virtually no computation. It is typical that the solution quality improves in jumps; when the current best solution changes, there is a jump in the quality. However, the penalty associated with waiting is often not as simple as a straight line.

To take into account bounded rationality, an agent must decide whether it should act or think more. This is challenging because an agent typically does not know how much better off it would be if it only spent a little bit more time reasoning. Moreover, the time spent thinking about whether it should reason may detract from actually reasoning about the domain. However, bounded rationality can be the basis for approximate reasoning.

9 Interaction of the Dimensions

Figure 1.6 summarizes the dimensions of complexity. Unfortunately, we cannot study these dimensions independently because they interact in complex ways. Here we give some examples of the interactions.

DimensionValues
Modularityflat, modular, hierarchical
Representation schemestates, features, relations
Planning horizonnon-planning, finite stage,
indefinite stage, infinite stage
Sensing uncertaintyfully observable, partially observable
Effect uncertaintydeterministic, stochastic
Preferencegoals, complex preferences
Learningknowledge is given, knowledge is learned
Number of agentssingle agent, multiple agents
Computational limitsperfect rationality, bounded rationality
Figure 1.6: Dimensions of complexity

The representation dimension interacts with the modularity dimension in that some modules in a hierarchy may be simple enough to reason in terms of a finite set of states, whereas other levels of abstraction may require reasoning about individuals and relations. For example, in a delivery robot, a module that maintains balance may only have a few states. A module that must prioritize the delivery of multiple parcels to multiple people may have to reason about multiple individuals (e.g., people, packages, and rooms) and the relations between them. At a higher level, a module that reasons about the activity over the day may only require a few states to cover the different phases of the day (e.g., there might be three states: busy time, available for requests, and recharge time).
The planning horizon interacts with the modularity dimension. For example, at a high level, a dog may be getting an immediate reward when it comes and gets a treat. At the level of deciding where to place its paws, there may be a long time until it gets the reward, and so at this level it may have to plan for an indefinite stage.
Sensing uncertainty probably has the greatest impact on the complexity of reasoning. It is much easier for an agent to reason when it knows the state of the world than when it doesn't. Although sensing uncertainty with states is well understood, sensing uncertainty with individuals and relations is an active area of current research.
The effect uncertainty dimension interacts with the modularity dimension: at one level in a hierarchy, an action may be deterministic, whereas at another level, it may be stochastic. As an example, consider the result of flying to Paris with a companion you are trying to impress. At one level you may know where you are (in Paris); at a lower level, you may be quite lost and not know where you are on a map of the airport. At an even lower level responsible for maintaining balance, you may know where you are: you are standing on the ground. At the highest level, you may be very unsure whether you have impressed your companion.
Preference models interact with uncertainty because an agent must have a trade-off between satisfying a major goal with some probability or a less desirable goal with a higher probability. This issue is explored in Section 9.1.
Multiple agents can also be used for modularity; one way to design a single agent is to build multiple interacting agents that share a common goal of making the higher-level agent act intelligently. Some researchers, such as Minsky (1986), argue that intelligence is an emergent feature from a "society" of unintelligent agents.
Learning is often cast in terms of learning with features - determining which feature values best predict the value of another feature. However, learning can also be carried out with individuals and relations. Much work has been done on learning hierarchies, learning in partially observable domains, and learning with multiple agents, although each of these is challenging in its own right without considering interactions with multiple dimensions.
Two of these dimensions, modularity and bounded rationality, promise to make reasoning more efficient. Although they make the formalism more complicated, breaking the system into smaller components, and making the approximations needed to act in a timely fashion and within memory limitations, should help build more complex systems.

Monday, 12 February 2018

Sparkling Water = H20 + Apache Spark






H20 – The Killer-App on Apache Spark

In-memory big data has come of age. The Apache Spark platform, with its elegant API, provides a unified platform for building data pipelines. H2O has focused on scalable machine learning as the API for big data applications. Spark + H2O combines the capabilities of H2O with the Spark platform – converging the aspirations of data science and developer communities. H2O is the Killer-Application for Spark.

Backdrop
Over the past few years, we watched Matei and the team behind Spark build a thriving open-source movement and a great development platform optimized for in-memory big data, Spark. At the same time, H2O built a great open source product with a growing customer base focused on scalable machine learning and interactive data science. These past couple of months the Spark and H2O teams started brainstorming on how to best combine H2O’s Machine Learning capabilities with the power of the Spark platform. The result: Sparkling Water.

Sparkling Water
Users can in a single invocation and process, get the best of Spark – its elegant APIs, RDDs, multi-tenant Context and H2O’s speed, columnar-compression and fully-featured Machine Learning and Deep-Learning algorithms.
One of the primary draws for Spark is its unified nature, enabling end-to-end building of API’s within a single system. This collaboration is designed to seamlessly enable H20’s advanced capabilities to be part of that data pipeline. The first step in this journey is enabling in-memory sharing through Tachyon and RDDs. The roadmap includes deeper integration where H2O’s columnar-compressed capabilities can be natively leveraged through ‘H2ORDD’.
Today, data gets parsed and exchanged between Spark and H2O via Tachyon. Users can interactively query big data both via SQL and ML from within the same context.
Sparkling Water enables use of H2O’s Deep Learning and Advanced Algorithms for Spark’s user community. H2O as the killer-application provides a robust machine learning engine and API for the Spark Platform. This will further empower application developers on Spark to build intelligent and smarter applications.
MLlib and H2O: The Triumph of Open Source!



MLlib is a library of efficient implementations of popular algorithms directly built using Spark. We believe that enterprise customers should have the choice to select the best tool for meeting their needs in the context of Spark. Over time, H2O will accelerate the community’s efforts towards production ready scalable machine learning. Fast fully featured algorithms in H2O will add to growing open source efforts in R, MLlib, Mahout and others, disrupting closed and proprietary vendors in machine-learning and predictive analytics.
Natural integration of H2O with the rest of Spark’s capabilities is a definitive win for enterprise customers.
Demo Code
package water.sparkling.demo

import water.fvec.Frame
import water.util.Log
import hex.gbm.GBMCall.gbm

object AirlinesDemo extends Demo {
  override def run(conf: DemoConf): Unit = {
    // Prepare data
    // Dataset
    val dataset = "data/allyears2k_headers.csv"
    // Row parser
    val rowParser = AirlinesParser
    // Table name for SQL
    val tableName = "airlines_table"
    // Select all flights with destination == SFO
    val query = """SELECT * FROM airlines_table WHERE dest="SFO" """

    // Connect to shark cluster and make a query over prostate, transfer data into H2O
    val frame:Frame = executeSpark[Airlines](dataset, rowParser, conf.extractor, tableName, query, local=conf.local)
    Log.info("Extracted frame from Spark: ")
    Log.info(if (frame!null) frame.toString + "\nRows: " + frame.numRows() else "")

    // Now make a blocking call of GBM directly via Java API
    val model = gbm(frame, frame.vec("isDepDelayed"), 100, true)
    Log.info("Model built!")
  }

  override def name: String = "airlines"
}