Wednesday, July 17, 2019
Building a 21st Century Organization
The power and versatility of the human visual system derive in large part from its remarkable ability to find structure and organization in the images encoded by the retinas. To discover and describe structure, the visual system uses a wide array of perceptual organization mechanisms, ranging from the relatively low-level mechanisms that underlie the simplest principles of grouping and segregation, to relatively high-level mechanisms in which complex learned associations guide the discovery of structure.

The Gestalt psychologists were the first to fully appreciate the fundamental importance of perceptual organization (e.g., see Kohler, 1947; Pomerantz & Kubovy, 1986). Objects often appear in different contexts and are almost never imaged from the same viewpoint; thus, the retinal images associated with physical objects are generally complex and varied. To have any hope of obtaining a useful interpretation of the retinal images, such as recognizing objects that have been encountered previously, there must be initial processes that organize the image data into those groups most likely to form meaningful objects. Perceptual organization is also important because it generally results in highly compact representations of the images, facilitating later processing, storage, and retrieval. (See Witkin & Tenenbaum, 1983, for a discussion of the importance of perceptual organization from the viewpoint of computational vision.)

Although much has been learned about the mechanisms of perceptual organization (see, e.g., Beck, 1982; Bergen, 1991; Palmer & Rock, 1994; Pomerantz & Kubovy, 1986), progress in developing testable quantitative theories has been slow. One area where substantial progress has been made is in models of texture grouping and segregation.
These models have begun to put the study of perceptual organization on a firm theoretical footing that is consistent with the psychophysics and physiology of low-level vision. Two general types of model for texture segregation have been proposed. In the feature-based models, retinal images are initially processed by mechanisms that find specific features, such as edge segments, line segments, blobs, and terminators. Grouping and segregation are then accomplished by finding the image regions that contain the same feature or cluster of features (see, e.g., Julesz, 1984, 1986; Marr, 1982; Treisman, 1985). These models are relatively simple, are consistent with some aspects of low-level vision, and have been able to account for a range of experimental results.

In the filter-based models, retinal images are initially processed by tuned channels, for example, contrast-energy channels selective for size and orientation. Grouping and segregation are then accomplished by finding those image regions with approximately constant output from one or more channels (Beck, Sutter, & Ivry, 1987; Bergen & Landy, 1991; Bovik, Clark, & Geisler, 1990; Caelli, 1988; Chubb & Sperling, 1988; Clark, Bovik, & Geisler, 1987; Fogel & Sagi, 1989; Graham, Sutter, & Venkatesan, 1993; Victor, 1988; Victor & Conte, 1991; Wilson & Richards, 1992). These models have some advantages over the existing feature-based models: They can be applied to arbitrary images, they are generally more consistent with known low-level mechanisms in the visual system, and they have proven capable of accounting for a wider range of experimental results.

However, the current models do not make accurate predictions for certain important classes of stimuli. One class of stimuli are those that contain regions of texture that can be segregated only on the basis of local structure (i.e., shape). Another broad class of stimuli for which most current perceptual organization models do not make adequate predictions are those containing nonstationary structures; specifically, structures that change smoothly and systematically across space. Nonstationary structures are the general rule in natural images because of perspective projection, and because many natural objects are the result of some irregular growth or erosion process.

A simple example of a nonstationary structure would be a contour formed by a sequence of line segments (a dashed contour) embedded in a background of randomly oriented line segments. Such contours are usually easily picked out by human observers. However, the elements of the contours cannot be grouped by the mechanisms contained in current filter-based or feature-based models, because no single orientation channel or feature is activated across the whole contour. Grouping the elements of such contours requires some kind of contour integration process that binds the successive contour elements together on the basis of local similarity.

A more complex example of a nonstationary structure would be an image of wood grain. Such a texture contains many contours whose spacing, orientation, and curvature vary smoothly across the image. Again, such textures are easily grouped by human observers but cannot be grouped by the mechanisms contained in the current models. Grouping the contour elements of such textures requires some form of texture integration (the two-dimensional analogue of contour integration).
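The contour integration idea can be illustrated with a small sketch. This is not the model developed in the article; it is a hypothetical illustration in which each line element is an (x, y, orientation) triple, and two elements are linked when they are close together, similarly oriented, and roughly collinear with the line joining them. The function names and the thresholds max_gap and max_turn are assumptions introduced only for this example. Links are then chained, so that the first and last elements of a curved contour end up in one group even though their orientations differ greatly.

```python
import math


def orientation_difference(a, b):
    """Smallest angle between two orientations in radians (orientations repeat every pi)."""
    d = abs(a - b) % math.pi
    return min(d, math.pi - d)


def link_contour_elements(elements, max_gap, max_turn):
    """Chain locally similar line elements into groups.

    elements: list of (x, y, orientation) triples.
    Returns one group label per element; equal labels mean the same group.
    """
    parent = list(range(len(elements)))

    def find(i):
        # Follow parent pointers to the group representative (with path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (xi, yi, ti) in enumerate(elements):
        for j in range(i + 1, len(elements)):
            xj, yj, tj = elements[j]
            gap = math.hypot(xj - xi, yj - yi)
            if gap == 0 or gap > max_gap:
                continue
            joining = math.atan2(yj - yi, xj - xi)
            locally_similar = (
                orientation_difference(ti, tj) <= max_turn
                and orientation_difference(ti, joining) <= max_turn
                and orientation_difference(tj, joining) <= max_turn
            )
            if locally_similar:
                parent[find(i)] = find(j)  # merge the two groups

    return [find(i) for i in range(len(elements))]
```

Because the links are chained, no single orientation has to be shared along the whole contour; only neighboring elements need to be similar.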
The heart of the problem for existing quantitative models of grouping and segregation is that they do not represent the structure of the image data with the richness achieved by the human visual system. The human visual system apparently represents image information in an elaborate hierarchical fashion that captures many of the spatial, temporal, and chromatic relationships among the entities grouped at each level of the hierarchy. Grouping and segregation based on simple feature distinctions or channel responses may well be an important initial component of perceptual organization, but the final organization that emerges must depend on more sophisticated processes.

The major theoretical aim of this study was to develop a framework for constructing and testing models of perceptual organization that capture some of the richness and complexity of the representations extracted by the human visual system, and yet are computationally well defined and biologically plausible. Within this framework, we have developed a model of perceptual organization for two-dimensional (2D) line images and evaluated it on a number of textbook perceptual organization demonstrations. In this article we refer to this model as the extended model when it is necessary to distinguish it from a simplified version, the restricted model, described later.

Perceptual organization must depend in some way on detected similarities and differences among image elements. Furthermore, it is obvious that similarities and differences along many different stimulus dimensions can contribute to the organization that is perceived. Although there have been many studies of individual stimulus dimensions, there have been few systematic attempts to study how multiple dimensions interact (Beck et al., 1987; Fahle & Abele, 1996; Li & Lennie, 1996).
The major experimental aim of this study was to measure how multiple stimulus dimensions are combined to determine grouping strength between image elements. To this end, we conducted a series of three-pattern grouping experiments to directly measure the tradeoffs among two, three, or four stimulus dimensions at a time. Predictions for these experiments were generated by a restricted version of the model appropriate for the experimental task. The experimental results provided both a test for the restricted model and a means of estimating the model's parameters. The estimated parameter values were used to generate the predictions of the extended model for complex patterns. The next four sections describe, respectively, the theoretical framework, the restricted model, the experiments and results, and the extended model and demonstrations.

Theoretical Framework for Perceptual Organization

In this section we discuss four important components of perceptual organization: hierarchical representation, detection of primitives, detection of similarities and differences among image parts, and mechanisms for grouping image parts. These components taken together form the theoretical framework on which the restricted and extended quantitative models are based.

Hierarchical Representation

It is evident that the mechanisms of perceptual organization yield a rich hierarchical representation that describes the relationship of parts to wholes at a number of levels; that is, the wholes at one level often become the parts at the next level. However, there is evidence that the process by which the hierarchical representation is constructed does not proceed strictly either from local to global or from global to local. The global structure of a large letter composed of small letters can be discovered before the structure of the individual small letters is discovered (Navon, 1977), and there exist ambiguous figures, such as R. C. James's classic Dalmatian dog, that can be solved locally only after at least some of the global structure is discovered. On the other hand, the discovery of structure must sometimes proceed from local to global; for example, it would be hard to extract the symmetry of a complex object without first extracting some of the structure of its subobjects.

Any well-specified theory of perceptual organization must define what is meant by parts, wholes, and relationships between parts and wholes. Given the current state of knowledge, all definitions, including the ones we have adopted, must be tentative. Nonetheless, some basic definitions must be made in order to form working models. In our framework, the most primitive objects are defined on the basis of the current understanding of image encoding in the primary visual cortex of the primate visual system. Higher order objects are defined to be collections of lower order objects (which may include primitive objects), together with information about the relationships between the lower order objects.

The range of relationships that the visual system can discover, the order and speed with which they are discovered, and the mechanisms used to find them are unsettled issues. As a starting point, the relationships we consider are quantitative similarities and differences in size, position, orientation, color, and shape. These dimensions were picked for historical and intuitive reasons: They are major categories in human language and therefore are likely to correspond to perceptually important categories. The precise definitions of these dimensions of similarity between objects are given later.
Detection of Primitives: Receptive-Field Matching

One of the simplest mechanisms for detecting structure within an image is receptive-field matching, in which relatively hard-wired circuits are used to detect the different spatial patterns of interest. For example, simple cells in the primary visual cortex of monkeys behave approximately like hard-wired templates: A strong response from a simple cell indicates the presence of a local image pattern with a position, orientation, size (spatial frequency), and phase (e.g., even or odd symmetry) similar to that of the receptive-field profile (Hubel & Wiesel, 1968; for a review, see DeValois & DeValois, 1988). The complex cells in the primary visual cortex are another example. A strong response from a typical complex cell indicates a particular position, orientation, and spatial frequency independent of the spatial phase (Hubel & Wiesel, 1968; DeValois & DeValois, 1988). Receptive-field matching may occur in areas other than the primary visual cortex, and may involve detection of image structures other than local luminance or chromatic contours, for example, structures such as phase discontinuities (von der Heydt & Peterhans, 1989) and simple radially symmetric patterns (Gallant, Braun, & Van Essen, 1993).

An important aspect of receptive-field matching in the visual cortex is that the information at each spatial location is encoded by a large number of neurons, each selective to a particular size or scale. The population as a whole spans a wide range of scales and hence provides a multiresolution or multiscale representation of the retinal images (see, e.g., DeValois & DeValois, 1988). This multiresolution representation may play an important role in perceptual organization. For example, grouping of low-resolution information may be used to constrain grouping of high-resolution information, and vice versa.
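The difference between phase-sensitive simple cells and phase-independent complex cells described above can be illustrated with a one-dimensional sketch. The article does not specify these formulas; the Gabor-shaped receptive-field profile, the half-wave rectification, and the quadrature "energy" combination used here are common textbook idealizations, adopted as assumptions for illustration. All function and parameter names are hypothetical.

```python
import math


def make_grating(freq, phase):
    """Return a 1D sinusoidal luminance pattern as a function of position x."""
    return lambda x: math.cos(2 * math.pi * freq * x + phase)


def linear_response(stimulus, freq, sigma, rf_phase, radius):
    """Template match: sum of stimulus times a Gabor-shaped receptive-field profile."""
    total = 0.0
    for x in range(-radius, radius + 1):
        envelope = math.exp(-(x * x) / (2 * sigma * sigma))
        profile = envelope * math.cos(2 * math.pi * freq * x + rf_phase)
        total += profile * stimulus(x)
    return total


def simple_cell_response(stimulus, freq, sigma, rf_phase, radius):
    """Idealized simple cell: half-wave rectified template match (phase sensitive)."""
    return max(0.0, linear_response(stimulus, freq, sigma, rf_phase, radius))


def complex_cell_response(stimulus, freq, sigma, radius):
    """Idealized complex cell: combines even- and odd-symmetric templates,
    giving a response that is nearly independent of stimulus phase."""
    even = linear_response(stimulus, freq, sigma, 0.0, radius)
    odd = linear_response(stimulus, freq, sigma, -math.pi / 2, radius)
    return math.sqrt(even * even + odd * odd)
```

With these assumptions, shifting the phase of a grating changes the simple-cell response substantially (it can drop to zero), while the complex-cell response stays almost constant as long as the envelope covers several cycles of the grating.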
The quantitative models described here assume that receptive-field matching provides the primitives for the subsequent perceptual organization mechanisms. However, to hold down the complexity of the models, the receptive-field matching stage is restricted to include only units similar to those of cortical simple cells with small receptive fields. These units proved sufficient for the line pattern stimuli used in the experiments and demonstrations.

Receptive-field matching is practical only for a few classes of simple image structure, such as contour segments; it is unreasonable to suppose that there are hard-wired receptive fields for every image structure that the visual system is able to detect, because of the combinatorial explosion in the number of receptive-field shapes that would be required. Thus, there must be additional, more flexible, mechanisms for detecting similarities and differences among image regions. These are discussed next.

Similarity/Difference Detection Mechanisms

Structure exists within an image if and only if some systematic similarities and differences exist between regions in the image. Thus, at the heart of any perceptual organization system there must be mechanisms that match or compare image regions to detect similarities and differences. (For this discussion, the reader may think of image regions as either parts of an image or as groups of detected primitives.)

Transformational matching

A well-known general method of comparing image regions is to determine how well the regions can be mapped onto each other, given certain allowable transformations (see, e.g., Neisser, 1967; Pitts & McCulloch, 1947; Rosenfeld & Kak, 1982; Shepard & Cooper, 1982; Ullman, 1996). The idea is, in effect, to use one image region as a transformable template for comparison with another image region.
If the regions closely match, following application of one of the allowable transformations, then a certain similarity between the image regions has been detected. Furthermore, the specific transformation that produces the closest match provides information about the differences between the image regions. For example, consider an image that contains two groups of small line segment primitives detected by receptive-field matching, such that each group of primitives forms a triangle. If some particular translation, rotation, and scaling of one of the groups brings it into perfect alignment with the other group, then we would know that the two groups are identical in shape, and from the aligning transformation itself we would know how much the two groups differ in position, orientation, and size.

There are many possible versions of transformational matching, and thus it represents a broad class of similarity-detection mechanisms. Transformational matching is also very powerful: there is no relationship between two image regions that cannot be described given an appropriately general set of allowable transformations. Thus, although there are other plausible mechanisms for detecting similarities and differences between image regions (see section on attribute matching), transformational matching is general enough to serve as a useful starting point for developing and evaluating quantitative models of perceptual organization.

Use of both spatial position and color

The most obvious form of transformational matching is based on standard template matching; that is, maximizing the correlation between the two image regions under the family of allowable transformations. However, template matching has a well-known limitation that often produces undesirable results. To understand the problem, note that each point in the two image regions is described by a position and a color.
The most general form of matching would consist of comparing both the positions and colors of the points. However, standard template matching compares only the colors (e.g., gray levels) at corresponding positions. If the points cannot be lined up in space, then large match errors may occur even though the positional errors may be small. A more useful and plausible form of matching mechanism would treat spatial and color information more equivalently by comparing both the spatial positions and the colors of the points or parts making up the objects. For such mechanisms, if the colors of the objects are identical, then similarity is determined solely by how well the spatial coordinates of the points or parts making up the objects can be aligned and on the values of the spatial transformations that bring them into the best possible alignment. In other words, when the colors are the same, the matching error is described by differences in spatial position. For such mechanisms, B matches A better than B matches C, in agreement with intuition. Later we describe a simple matching mechanism that simultaneously compares both the spatial positions and the colors of object points. We show that this mechanism produces matching results that are generally more perceptually sensible than those of template matching.

Attribute matching

Another well-known method of comparing groups is to measure various attributes or properties of the groups, and then represent the differences in the groups by differences in the measured attributes (see, e.g., Neisser, 1967; Rosenfeld & Kak, 1982; Selfridge, 1956; Sutherland, 1957). These attributes might be simple measures, such as the mean and variance of the color, position, orientation, or size of the primitives in a group, or they might be more complex measures, such as the invariant shape moments.
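To make the transformational matching idea concrete for the case in which colors are identical and only spatial positions matter, here is a minimal sketch. It is not the matching mechanism described later in the article. It assumes, for simplicity, that the correspondence between the points of the two groups is already known, and it finds the least-squares translation, rotation, and scaling that map one group onto the other. The function name and the returned dictionary keys are hypothetical.

```python
import cmath
import math


def align_similarity(src, dst):
    """Least-squares similarity transform mapping src onto dst.

    src, dst: equal-length lists of (x, y) points; src[i] is assumed to
    correspond to dst[i], and the points of src must not all coincide.
    Points are treated as complex numbers so that the transform is
    w = a * z + b, where |a| is the scaling, the angle of a is the
    rotation, and b is the offset applied after rotating and scaling
    about the origin.
    """
    z = [complex(x, y) for x, y in src]
    w = [complex(x, y) for x, y in dst]
    n = len(z)
    z_mean = sum(z) / n
    w_mean = sum(w) / n
    zc = [p - z_mean for p in z]  # centered source points
    wc = [q - w_mean for q in w]  # centered target points

    denom = sum(abs(p) ** 2 for p in zc)
    a = sum(p.conjugate() * q for p, q in zip(zc, wc)) / denom
    b = w_mean - a * z_mean

    # Remaining mismatch after the best alignment: zero for identical shapes.
    residual = sum(abs(a * p + b - q) ** 2 for p, q in zip(z, w))
    return {
        "scale": abs(a),
        "rotation": cmath.phase(a),
        "translation": (b.real, b.imag),
        "residual": residual,
    }
```

In this sketch a residual near zero signals that the two groups have the same shape, while the recovered scale, rotation, and translation describe how they differ in size, orientation, and position, mirroring the two-triangle example given earlier.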
It is likely that perceptual organization in the human visual system involves both transformational matching and attribute matching. However, the specific models considered here involve transformational matching exclusively. The primary reason is that perceptual organization models based on transformational matching have relatively few free parameters, yet they are sensitive to differences in image structure, an essential requirement for moving beyond existing filter- and feature-based models. For example, a simple transformational matching mechanism (described later) can detect small differences in arbitrary 2D shapes without requiring an explicit description of the shapes. On the other hand, specifying an attribute-matching model that can detect small differences in arbitrary shapes requires specifying a set of attributes that can describe all the relevant details of arbitrary shapes. This type of model would require many assumptions and/or free parameters. Our current view is that transformational matching (or something like it) may be the central mechanism for similarity/difference detection and that it is supplemented by certain forms of attribute matching.

Matching groups to categories

The discussion so far has assumed implicitly that transformational and attribute matching occur between different groups extracted from the image. However, it is obvious that the brain is also able to compare groups with stored information because this is essential for memory. Thus, the visual system may also measure similarities and differences between groups and stored categories, and perform subsequent grouping using these similarities and differences. These stored categories might be represented by prototypes or sets of attributes. Rather than use stored categories, the visual system could also measure similarities and differences to categories that emerge during the perceptual processing of the image.
For example, the visual system could extract categories corresponding to prevalent colors within the image, and then perform subsequent grouping on the basis of similarities between the colors of image primitives and these emergent color categories.

Grouping Mechanisms

Once similarities and differences among image parts are discovered, the parts may be grouped into wholes. These wholes may then be grouped to form larger wholes, resegregated into a different collection of parts, or both. However, it is important to keep in mind that some grouping can occur before all of the relevant relationships between the parts have been discovered. For example, it is possible to group together all image regions that have a similar color before discovering the geometrical relationships among the regions. As further relationships are discovered, the representations of wholes may be enriched, new wholes may be formed, or wholes may be broken into new parts and reformed. Thus, the discovery of structure is likely to be an asynchronous process that operates simultaneously at multiple levels, often involving an elaborate interleaving of similarity/difference detection and grouping.

Within the theoretical framework proposed here we consider one grouping constraint (the generalized uniqueness principle) and three grouping mechanisms: transitive grouping, nontransitive grouping, and multilevel grouping. The uniqueness principle and the grouping mechanisms can be applied at multiple levels and can be interleaved with similarity/difference detection.

Generalized uniqueness principle

The uniqueness principle proposed here is more general: it enforces the constraint that at any time, and at any level in the hierarchy, a given object (part) can be assigned to only one superordinate object (whole).
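One way to picture this constraint is as a tree in which every object stores at most one containing whole. The sketch below is a hypothetical data structure written for illustration, not part of the article's model; the class name, its methods, and the choice to detach an object from its old whole when it is reassigned are all assumptions. It does not guard against cyclic assignments.

```python
class VisualObject:
    """An object in the hierarchy: a primitive or a group of lower order objects."""

    def __init__(self, name):
        self.name = name
        self.whole = None  # the single superordinate object, if any
        self.parts = []    # lower order objects currently assigned to this one

    def assign_to(self, whole):
        """Assign this object to `whole`, first removing it from any previous whole,
        so that it never belongs to two wholes at the same time."""
        if self.whole is not None:
            self.whole.parts.remove(self)
        self.whole = whole
        whole.parts.append(self)

    def containing_wholes(self):
        """Names of the nested wholes that contain this object, innermost first."""
        chain = []
        node = self.whole
        while node is not None:
            chain.append(node.name)
            node = node.whole
        return chain
```

Reassignment, rather than raising an error, was chosen here because the text allows wholes to be broken into new parts and reformed as further relationships are discovered.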
An object at the lowest level (a primitive) in the hierarchy can be assigned to only one object at the next level, which in turn can be assigned to only one object at the next level, and so on. The sequence of nested objects in the hierarchy containing a given object is called the part-whole path of the object. The generalized uniqueness principle, if valid, constrains the possible perceptual organizations that can be found by the visual system.

Nontransitive grouping

Our working hypothesis is that similarity in spatial position (proximity) contributes weakly to nontransitive grouping. If proximity were making a dominant contribution, then separated objects could not bind together separately from the background objects. Proximity contributes powerfully to a different grouping mechanism, transitive grouping, which is described next. We propose that transitive and nontransitive grouping are in some competition with each other and that the visual system uses both mechanisms in the search for image structure.

References

Beck, J. (Ed.). (1982). Organization and representation in perception. Hillsdale, NJ: Erlbaum.

Beck, J., Sutter, A., & Ivry, R. (1987). Spatial frequency channels and perceptual grouping in texture segregation. Computer Vision, Graphics and Image Processing, 37, 299-325.

Bergen, J. R. (1991). Theories of visual texture perception. In D. Regan (Ed.), Spatial vision (pp. 114-134). New York: Macmillan.

Bergen, J. R., & Landy, M. S. (1991). Computational modeling of visual texture segregation. In M. S. Landy & J. A. Movshon (Eds.), Computational models of visual processing (pp. 253-271). Cambridge, MA: MIT Press.

Bovik, A. C., Clark, M., & Geisler, W. S. (1990). Multichannel texture analysis using localized spatial filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12, 55-73.

Caelli, T. M. (1988). An adaptive computational model for texture segmentation. IEEE Transactions on Systems, Man and Cybernetics, 18, 9-17.

Chubb, C., & Sperling, G. (1988). Processing stages in non-Fourier motion perception. Investigative Ophthalmology and Visual Science, 29(Suppl.), 266.

Clark, M., Bovik, A. C., & Geisler, W. S. (1987). Texture segmentation using a class of narrowband filters. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (pp. 571-574). New York: IEEE.

Fahle, M., & Abele, M. (1996). Sub-threshold summation of orientation, color, and luminance cues in figure-ground discrimination. Investigative Ophthalmology and Visual Science, 37(Suppl.), S1147.

Fogel, I., & Sagi, D. (1989). Gabor filters as texture discriminator. Biological Cybernetics, 61, 103-113.

Gallant, J. L., Braun, J., & Van Essen, D. C. (1993, January). Selectivity for polar, hyperbolic, and Cartesian gratings in macaque visual cortex. Science, 259, 100-103.

Geisler, W. S., & Albrecht, D. G. (1995). Bayesian analysis of identification in monkey visual cortex: Nonlinear mechanisms and stimulus certainty. Vision Research, 35, 2723-2730.

Geisler, W. S., & Albrecht, D. G. (1997). Visual cortex neurons in monkeys and cats: Detection, discrimination and identification. Visual Neuroscience, 14, 897-919.

Geisler, W. S., & Chou, K. (1995). Separation of low-level and high-level fac