Degree Type


Date of Award


Degree Name

Doctor of Philosophy


Mechanical Engineering


Human Computer Interaction

First Advisor

Eliot H. Winer


Augmented Reality (AR), the ability to present the real world overlaid with digital information, is quickly gaining popularity for a number of applications ranging from entertainment to manufacturing. Improved hardware and miniaturization, along with robust sensor processing techniques, have made AR a viable technology on a variety of high-end and commodity devices. The intention of any AR system is to “augment” a user’s reality with digital content in order to aid the user in a particular task. This task can vary from analyzing a factory layout in its intended physical space to following directions on a windshield while driving. Surgeons use AR technology to locate tumors on the fly during operations. The entertainment industry has used AR in everything from hand-held games to major attractions. AR has been used in all forms of product development, from design to assembly to maintenance, to guide users through complex processes. Particularly in manufacturing, AR guided assembly viewers have been developed and found to benefit end-users considerably. They have reduced the cognitive load on the user, reduced assembly times, increased quality, and reduced training time. AR guided assembly viewers have also, in some cases, removed the advantage that experts have over non-experts.

However, authoring the content within an AR guided assembly viewer is the biggest obstacle to its widespread adoption. Issues hindering authoring include: 1) an inadequate software skillset on the part of the person responsible for authoring, 2) complex hardware and software systems, and 3) an authoring process that consumes considerable time and resources. To mitigate these issues, researchers have proposed a number of solutions without much success. Solutions include: 1) using markers placed in the real environment as a reference for AR systems to display the necessary content, 2) creating content solely in a virtual environment, 3) performing gestures that facilitate content creation, and 4) using sensors (e.g., GPS) to locate the view of the user and overlay content.

Based on a literature review of the numerous solutions available, various gaps were identified in the context of AR guided assembly and formed into three research issues: 1) establishing automatic registration of parts during an expert demonstration, 2) ensuring sufficient domain level expertise is incorporated into AR guided assembly instructions, and 3) evaluating the instructions generated by an authoring tool for accuracy and quality.

To direct the creation of new AR guided assembly authoring methods that address the aforementioned research issues, a classification technique established by Hampshire (also known as the Hampshire classification) was used. Along with the Hampshire classification, the various factors necessary to successfully author AR content were categorized according to their specific environment, interaction metaphors, and author’s domain knowledge. All of these were used to support the design choices made in the proposed authoring technique known as expert demonstration.

This technique automatically generates the various assembly steps and part movements by analyzing 3D point cloud data generated from an expert who performs the assembly (i.e., “demonstrates” it). Through this demonstration, a corresponding skin model is generated and used to detect and remove the expert’s hands in the captured frames. Image analysis is also performed to remove the scene background as well as to identify and track each part in the assembly. The analysis results in the detection of each individual assembly step, each part location, and each assembly path. This information is used to generate AR guided assembly work instructions.
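
As a rough illustration of the last stage described above, the following sketch shows one way tracked part positions could be split into assembly steps and paths. It is not the dissertation's algorithm: the function `segment_steps`, the `AssemblyStep` record, the `move_threshold` parameter, and the rule that a step is a run of frames in which a part keeps moving are all assumptions made for this example. It also assumes that hand removal, background removal, and part tracking have already produced one centroid per part per frame.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class AssemblyStep:
    part_id: str      # hypothetical identifier of the tracked part
    start_frame: int  # last frame before the part starts moving
    end_frame: int    # frame at which the part comes to rest
    path: list        # centroids from start_frame to end_frame


def segment_steps(tracks, move_threshold=0.01):
    """Split per-part centroid tracks into assembly steps.

    tracks maps a part id to a list of (x, y, z) centroids, one per frame.
    A step is a maximal run of frames in which the part moves more than
    move_threshold between consecutive frames (an assumed heuristic).
    """
    steps = []
    for part_id, positions in tracks.items():
        start = None
        for i in range(1, len(positions)):
            moving = dist(positions[i - 1], positions[i]) > move_threshold
            if moving and start is None:
                start = i - 1  # motion begins between frames i-1 and i
            elif not moving and start is not None:
                # The part was last seen moving into frame i-1.
                steps.append(
                    AssemblyStep(part_id, start, i - 1, positions[start:i])
                )
                start = None
        if start is not None:  # part was still moving in the final frame
            steps.append(
                AssemblyStep(part_id, start, len(positions) - 1, positions[start:])
            )
    # Order steps by when they begin so they read as instructions.
    steps.sort(key=lambda s: s.start_frame)
    return steps


# Made-up tracks for two parts over six frames.
example_tracks = {
    "block_a": [(0, 0, 0), (0, 0, 0), (0.1, 0, 0), (0.2, 0, 0), (0.2, 0, 0), (0.2, 0, 0)],
    "block_b": [(1, 0, 0), (1, 0, 0), (1, 0, 0), (1, 0, 0), (1, 0.1, 0), (1, 0.2, 0)],
}
for step in segment_steps(example_tracks):
    print(step.part_id, step.start_frame, step.end_frame, step.path[-1])
```

Each resulting record carries the three outputs the paragraph names: the step itself, the final part location (the last point of `path`), and the assembly path.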

Skin detection is a critical component of the expert demonstration authoring technique because, without filtering out the hands, detecting, identifying, and tracking the parts becomes more difficult. Current skin detection techniques such as Gaussian mixture models (GMMs) require clean data and are unable to handle noisy data without overfitting the model. This led to the development of a novel algorithm called particle swarm optimized Gaussian mixture model (PSOGMM). PSOGMM was designed to take only a single input variable describing the amount of noise in the system. It was compared to a traditional GMM using two independent datasets.
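
The abstract does not describe how PSOGMM works internally, so the sketch below only illustrates the general idea suggested by its name: using a particle swarm search to choose Gaussian mixture parameters that score well on the data. It does not reproduce PSOGMM or its noise input. Everything in it is an assumption made for illustration, including the 1-D two-component mixture, the log-likelihood objective, the function names `gmm_log_likelihood` and `pso_fit_gmm`, the swarm constants, and the synthetic data.

```python
import math
import random


def gmm_log_likelihood(data, params):
    """Log-likelihood of 1-D data under a two-component Gaussian mixture.

    params = (weight, mean1, std1, mean2, std2); weight belongs to component 1.
    """
    w, m1, s1, m2, s2 = params
    norm = math.sqrt(2 * math.pi)
    total = 0.0
    for x in data:
        p1 = w * math.exp(-0.5 * ((x - m1) / s1) ** 2) / (s1 * norm)
        p2 = (1 - w) * math.exp(-0.5 * ((x - m2) / s2) ** 2) / (s2 * norm)
        total += math.log(p1 + p2 + 1e-300)  # guard against log(0)
    return total


def clamp(params, bounds):
    """Keep every parameter inside its (low, high) bound."""
    return [min(max(v, lo), hi) for v, (lo, hi) in zip(params, bounds)]


def pso_fit_gmm(data, bounds, n_particles=20, n_iters=60, seed=0):
    """Search mixture parameters with a basic particle swarm (illustrative)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_score = [gmm_log_likelihood(data, p) for p in pos]
    g = max(range(n_particles), key=lambda i: best_score[i])
    g_pos, g_score = best_pos[g][:], best_score[g]
    history = [g_score]  # best score found so far, per iteration
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                # Inertia plus pulls toward the personal and global bests.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[i][d]))
            pos[i] = clamp([p + v for p, v in zip(pos[i], vel[i])], bounds)
            score = gmm_log_likelihood(data, pos[i])
            if score > best_score[i]:
                best_pos[i], best_score[i] = pos[i][:], score
                if score > g_score:
                    g_pos, g_score = pos[i][:], score
        history.append(g_score)
    return g_pos, g_score, history


# Synthetic 1-D samples standing in for a colour feature (not real skin data).
data_rng = random.Random(1)
samples = ([data_rng.gauss(0.3, 0.05) for _ in range(100)]
           + [data_rng.gauss(0.7, 0.05) for _ in range(100)])
# Bounds for (weight, mean1, std1, mean2, std2); stds stay above zero.
param_bounds = [(0.05, 0.95), (0.0, 1.0), (0.02, 0.5), (0.0, 1.0), (0.02, 0.5)]
fitted, fitted_score, _ = pso_fit_gmm(samples, param_bounds)
print(fitted, fitted_score)
```

A likelihood-only objective like this one has no defence against overfitting noisy data, which is the weakness of traditional GMMs that the paragraph above says PSOGMM was developed to address.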

To illustrate the authoring of AR guided assembly work instructions, AREDA (Augmented Reality via Expert Demonstration Authoring), an AR authoring tool, was developed. It was divided into two phases: the demonstration phase and the refinement phase. The demonstration phase consisted of determining the various calibration parameters (background, area, and skin) as well as recording and processing the assembly demonstration. The refinement phase consisted of taking the automatically generated AR work instructions and fixing mistakes made during the demonstration phase, refining the orientations and positions of the 3D models, and adding other forms of information such as textual instructions or images. AREDA was tested on three assemblies of increasing complexity, namely DUPLO blocks, a 3D printed grip vise, and a laptop.


Copyright Owner

Bhaskar Bhattacharya



File Format


File Size

193 pages