Simultaneous Localization and Mapping (SLAM) refers to the family of algorithms that reliably estimate both the pose of a mobile vehicle (robot) and a geometric map of the environment through which it moves. A rough estimate of the vehicle's pose is usually obtained from sensors such as gyroscopes, wheel encoders, or visual odometry systems. These sensors provide instantaneous motion parameters that, combined with an appropriate motion model, yield a pose estimate at each time instant. However, such sensors cannot deliver precise motion parameters, so the estimated pose accumulates error over time. SLAM algorithms address this problem by using exteroceptive sensors to obtain additional measurements and fusing them with the other data sources, thereby bounding the growing error and producing a reliable map.

Conventional SLAM systems employ exteroceptive sensors that deliver range and bearing measurements between the vehicle and distinctive features (landmarks) in the environment, typically laser range finders or stereo cameras. Laser range finders are generally very expensive and are mostly suited to indoor environments. Stereo cameras require extra computation for associating features between the two cameras, and when features lie far from the camera (depending on its resolution), the estimated landmark positions have large uncertainties that cannot be modeled well by Gaussian distributions. Applying monocular cameras has therefore attracted considerable research attention in recent years. This type of SLAM, known in the literature as monocular SLAM, consists of three parts. First, feature extraction and tracking across frames.
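The unbounded drift of pure dead reckoning described above can be illustrated with a minimal sketch. The snippet below integrates noisy velocity commands through a standard unicycle motion model; the function name, time step, and noise level are illustrative choices, not part of any particular SLAM system.

```python
import math
import random

def dead_reckon(controls, noise_std=0.05, seed=0, dt=0.1):
    """Integrate noisy (v, omega) controls with a unicycle motion model.

    Each step adds a small zero-mean error to the measured velocities,
    so the pose estimate drifts without bound -- the problem that SLAM
    corrects with exteroceptive measurements. All parameters here are
    illustrative assumptions.
    """
    rng = random.Random(seed)
    x, y, theta = 0.0, 0.0, 0.0
    for v, omega in controls:
        # Measured controls are corrupted by zero-mean Gaussian noise.
        v_meas = v + rng.gauss(0.0, noise_std)
        w_meas = omega + rng.gauss(0.0, noise_std)
        x += v_meas * math.cos(theta) * dt
        y += v_meas * math.sin(theta) * dt
        theta += w_meas * dt
    return x, y, theta

# Drive straight for 100 steps; noiseless integration ends at (10, 0, 0),
# while the noisy estimate wanders away from that pose.
controls = [(1.0, 0.0)] * 100
print(dead_reckon(controls, noise_std=0.0))
print(dead_reckon(controls, noise_std=0.05))
```

Running the sketch with a larger `noise_std` or more steps makes the gap between the true and estimated poses grow, which is exactly why the fused, landmark-based correction of SLAM is needed.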