Multi-modal compositions and AI

Composing multi-modal inputs from different types of sensors into a real-time response is inherently difficult. A case in point is the Autopilot feature in Tesla cars. Originally, the suite of Autopilot sensors – which Tesla claimed would include everything needed to eventually achieve full self-driving capability – comprised eight cameras, a front-facing radar, and 16 ultrasonic sensors around the vehicle. From the start, Tesla chose not to include expensive LIDAR capabilities for autonomous driving.

Here is a summary of the progression of Tesla's sensor and compute hardware:

  • AP0 – No driver-assistance sensors
  • AP1 – Autopilot driven by a single front-facing camera using Mobileye hardware. Subsequent updates added sensors and a front-facing radar.
  • AP2 – Coordinating the inputs with Mobileye proved difficult, so Tesla opted for an NVIDIA GPU instead. One of the first AI-driven features was rain-intensity-sensing wipers based on camera vision, rather than the dedicated rain sensor typically mounted on the windshield.
  • AP3 – To increase processing speed, Tesla moved from NVIDIA to its own dual-chip computer to process all the sensor inputs and also provide redundancy.
  • FSD1 – Added hands-free driving on surface streets
  • FSD2 – Removed radar functionality
  • FSD3 – Removed ultrasonic sensor capability
  • FSD4 – Added high-fidelity radar to enable driving in inclement conditions where vision is impaired.
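The fusion challenge behind these hardware changes can be sketched as a confidence-weighted combination of per-sensor estimates. The following is an illustrative sketch only – the function, weights, and numbers are hypothetical and do not reflect Tesla's actual implementation:

```python
# Illustrative sketch: inverse-variance fusion of distance estimates
# from multiple sensor modalities. All names and values are hypothetical.

def fuse_estimates(readings):
    """Combine (estimate, variance) pairs; lower variance means more trust.

    Returns the inverse-variance weighted mean and its combined variance.
    """
    weights = [1.0 / var for _, var in readings]
    total = sum(weights)
    fused = sum(est * w for (est, _), w in zip(readings, weights)) / total
    return fused, 1.0 / total

# A camera is precise in clear weather; radar degrades less in rain or fog.
clear_weather = [(50.2, 0.5), (49.5, 2.0)]  # (metres, variance): camera, radar
fused, variance = fuse_estimates(clear_weather)
```

Dropping a modality (as with the radar removal in FSD2) removes one term from the weighted sum, so the remaining sensors must carry the full estimate – which is precisely why vision impairment in bad weather motivated reintroducing radar in FSD4.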

At Numorpho Cybernetic Systems (NUMO), our fourth tenet, the TAU Codex Orchestrator, will coordinate between multi-modal inference engines to correctly predict the needs of actionable intelligence, and we will be mindful of progressions such as Tesla's to ensure that appropriateness is maintained in our depictions. For truly intelligent actions, however, we believe a multi-modal approach is needed to fully ascertain all the conditions for an equitable response.
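A minimal sketch of what such coordination between modality-specific inference engines could look like is shown below. All class, engine, and field names here are hypothetical illustrations, not the actual TAU Codex Orchestrator design:

```python
# Hypothetical sketch of an orchestrator dispatching inputs to
# per-modality inference engines and collecting their outputs.

class Orchestrator:
    def __init__(self):
        self.engines = {}  # modality name -> inference callable

    def register(self, modality, engine):
        self.engines[modality] = engine

    def infer(self, inputs):
        """Run each registered engine on its modality's input and
        return a per-modality view of the results."""
        return {m: self.engines[m](x)
                for m, x in inputs.items() if m in self.engines}

orch = Orchestrator()
orch.register("vision", lambda frame: {"object": "vehicle", "conf": 0.92})
orch.register("radar", lambda echo: {"range_m": 48.7, "conf": 0.88})
result = orch.infer({"vision": "frame_0", "radar": "echo_0"})
```

Keeping each engine behind a common callable interface means modalities can be added or removed (as in the Tesla progression above) without restructuring the coordination layer.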

NI+IN UCHIL Founder, CEO & Technical


