SwiftUI, a quicker way of doing things

You may have noticed that iPhones don't look quite the way they used to. A lot has changed internally, and User Interface (UI) components look far different than they once did.

The code to create these views has evolved, and as a result, so have the UI components themselves. 

UIKit has been Apple's framework for building native iOS user interfaces since 2008, predating even Swift itself (Apple's programming language for native iOS apps, released in 2014). Over time, UIKit has evolved into a robust and flexible framework.

In 2019, Apple introduced SwiftUI as a newer, faster framework for building UIs. At the time, though, developers were slow to jump on board because SwiftUI simply was not ready: online discussion was scarce, documentation was thin, and many UI components were still missing. Five years later, that is no longer the case.

As Apple positions SwiftUI to gradually mature and overtake UIKit as the primary framework for iOS apps, new native components will continue to arrive. As a result, the UI components on iOS devices, and the iOS user experience itself, will keep evolving too.

In this blog, we will explore the benefits of transitioning from UIKit to SwiftUI for the views in your project, while keeping in mind the limitations and pain points you may run into.


Benefits of SwiftUI vs UIKit - A quick practical demo 

The benefits SwiftUI offers largely come from its declarative syntax, which is easy to both read and write. Rather than explaining how it differs from UIKit, I think it might be easier to just show you.

Contacts List App Example

Below I have created two simple apps showing a contacts list, one in UIKit and one in SwiftUI. 

UIKit - Difficult to understand 

As you can see below (figure 3), UIKit code can be nuanced and difficult for the untrained eye to follow.

For this app in UIKit, I had to create both a Storyboard (a canvas where I can drag and drop objects to build a view) and a view controller to manage the data: two files and 63 lines in total.

SwiftUI - Easy to understand and intuitive to write

SwiftUI, in comparison, is far easier to write and read, even for an untrained eye. It required only one file, containing 41 lines. I was able to show the code below to a friend with no coding knowledge, and they could understand how the app shown on the right was made from the code on the left.

There is a List separated into two sections, one containing the names of regular contacts and the other favourite contacts.
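Since the original figure isn't reproduced here, a minimal sketch of what that SwiftUI view might look like (the names and data are illustrative, not the actual app's):

```swift
import SwiftUI

// A minimal sketch of the contacts list app; the original figures
// aren't reproduced here, so names and data are illustrative.
struct ContactsView: View {
    let favourites = ["Dana"]
    let contacts = ["Alice", "Bob", "Charlie"]

    var body: some View {
        List {
            Section(header: Text("Favourite Contacts")) {
                ForEach(favourites, id: \.self) { name in
                    Text(name)
                }
            }
            Section(header: Text("Contacts")) {
                ForEach(contacts, id: \.self) { name in
                    Text(name)
                }
            }
        }
    }
}
```

Even without knowing Swift, you can read the structure straight off the code: a list, two sections, a row of text per name.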

Another benefit shown below is SwiftUI Live Previews. You get a live, interactive preview of the view that updates in real time as you edit, a clear improvement over UIKit, where you have to run the app in a simulator every time you want to see your changes.

Summary of benefits 

  • Declarative syntax that is far easier to write and understand

  • Less code needed to create views

  • An opportunity to remove older views and legacy code

  • Staying up to date with the latest UI components available

  • SwiftUI Live Previews


Challenges and considerations

Challenge 1: SwiftUI being the Newcomer 

SwiftUI hasn’t been around as long as UIKit so there are still some components that may not be as robust or malleable. UIKit is relatively more stable and has more UI components available than SwiftUI. There is also much more online support [1] and documentation for UIKit.

SwiftUI has now been around for five years, which has been enough time for solid documentation to accumulate and for most of the initial bugs to be ironed out. I believe the tradeoff is now a fair one: the majority of components are available, and newer ones with richer functionality keep arriving.

You can think of SwiftUI as an automatic car and UIKit as a manual one. UIKit gives you more control, but it takes considerably more learning time to get the hang of.

SwiftUI, being the automatic car, handles many of the nuances we used to worry about ourselves. In the example above, we used to dequeue cells in table views manually (figure 3, line 51). Dequeuing reuses cells rather than creating new cell objects, which means less memory usage and better scrolling performance on a huge table. In figure 4 we use List, where we don't have to think about that at all: a great improvement in SwiftUI is that Lists reuse cells automatically, unlike table views in UIKit.
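For readers without the figure to hand, the manual dequeuing being described looks roughly like this in UIKit (the cell identifier and data are illustrative):

```swift
import UIKit

// Roughly what the manual cell reuse in figure 3 looks like;
// the cell identifier and data are illustrative.
class ContactsViewController: UITableViewController {
    let contacts = ["Alice", "Bob", "Charlie"]

    override func tableView(_ tableView: UITableView,
                            numberOfRowsInSection section: Int) -> Int {
        contacts.count
    }

    override func tableView(_ tableView: UITableView,
                            cellForRowAt indexPath: IndexPath) -> UITableViewCell {
        // Reuse an off-screen cell instead of allocating a new one,
        // keeping memory usage low while scrolling a long table.
        let cell = tableView.dequeueReusableCell(withIdentifier: "ContactCell",
                                                 for: indexPath)
        cell.textLabel?.text = contacts[indexPath.row]
        return cell
    }
}
```

With SwiftUI's List, all of this bookkeeping disappears into the framework.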

If there are some limitations with SwiftUI that only UIKit can solve, the good news is that we can integrate UIKit and SwiftUI views in our projects alongside each other. This leads us to our next challenge…

Challenge 2: Integrating SwiftUI Views in an existing project

Making this transition isn't as simple as throwing away your project and starting a new one. It is a slow, gradual change with quite a bit of overlap.

The key to this is to manage your architecture well: make sure the new code is neatly arranged in new folders or modules, while also marking the old code and files you plan on removing. In our Kogan iOS project, we created several modules, which gave us a better separation of concerns, improved testability, and made each module more reliable and robust.

SwiftUI is simply a framework for creating views; much of your remaining code, such as the coordinators that handle navigation or the view models that feed data into views, can still be carried over from your existing project.

Navigation in a SwiftUI app is a little different from a UIKit app. In SwiftUI you simply wrap your views in a NavigationView and use NavigationLinks to take you to another view (reference [5] explains this wonderfully). However, for those of you transitioning from UIKit, this needn't concern you.
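As a quick illustration, a hypothetical two-screen flow in pure SwiftUI might look like this (view names are made up for the example):

```swift
import SwiftUI

// A hypothetical two-screen flow: tapping the link pushes
// DetailView onto the navigation stack.
struct RootView: View {
    var body: some View {
        NavigationView {
            NavigationLink(destination: DetailView()) {
                Text("Show detail")
            }
            .navigationTitle("Home")
        }
    }
}

struct DetailView: View {
    var body: some View {
        Text("Detail screen")
    }
}
```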

In UIKit you create navigation controllers to manage which view controllers are shown and navigated between. If you are transitioning, you can keep using navigation controllers, just as you do in UIKit, for your SwiftUI views: simply wrap each SwiftUI view in a UIHostingController (see Figure 6).
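Since Figure 6 isn't reproduced here, the wrapping looks roughly like this (ProfileView and the function name are illustrative stand-ins):

```swift
import SwiftUI
import UIKit

// ProfileView stands in for whatever SwiftUI view you've built.
struct ProfileView: View {
    var body: some View { Text("Profile") }
}

// UIHostingController makes the SwiftUI view behave like any other
// view controller, so it slots straight into a UIKit navigation stack.
func showProfile(from navigationController: UINavigationController) {
    let hosting = UIHostingController(rootView: ProfileView())
    navigationController.pushViewController(hosting, animated: true)
}
```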

Another myth I’ve seen online is that it is difficult to host SwiftUI views in a predominantly UIKit-led project. This is not true at all; it's very easy to do, and you can even add fail-safes that fall back to showing your old UIKit views. Take a look below!

Navigating from a UIKit view to a SwiftUI view was as easy as the code below.

At Kogan, we were very cautious during our transition. One of the ways we did this was by utilising Feature Flags, so that when turned on, the new SwiftUI view will display, and when turned off, the old UIKit view will display. This can be done in conjunction with Firebase Remote Config, so that you can instantly switch what view users see, instead of having to make a patch release if something goes wrong. 
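A sketch of the idea (the function and class names are hypothetical; in practice the flag would be read from Firebase Remote Config rather than passed in directly):

```swift
import SwiftUI
import UIKit

// The existing UIKit screen we may fall back to.
final class LegacyContactsViewController: UIViewController {}

// Hypothetical factory: the flag decides which implementation ships.
func makeContactsScreen(useSwiftUI: Bool) -> UIViewController {
    if useSwiftUI {
        // New SwiftUI view, wrapped so UIKit navigation can present it.
        return UIHostingController(rootView: Text("New SwiftUI contacts view"))
    } else {
        // Fail-safe: show the old UIKit view when the flag is off.
        return LegacyContactsViewController()
    }
}
```

Because the flag is remote, flipping it back requires no app update at all.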


Conclusion 

SwiftUI is ready for any team pondering the jump. It's a great opportunity to clean up your project: you will gradually remove legacy code while introducing the latest UI to your app. Once you have cleared the initial hurdle of replacing your views with SwiftUI, development will be much faster and the code much easier to understand!

The key technical issue you will face when moving from UIKit to SwiftUI is working with a framework that has had less time to mature. I believe five years has been enough: SwiftUI's declarative views make for faster coding, and the framework automatically handles former pain points such as dequeuing cells in a table view.

Another key technical issue is running a hybrid application while you transition. This can be slow, but it is manageable, as we have shown here at Kogan, and you will only get faster.


From Database to Domain: Elevating Software Development with DDD

Introduction

In the complex landscape of software development, aligning design methodologies with business needs is crucial. Domain-Driven Design (DDD) emerges as a key approach in addressing this alignment, especially in projects characterized by intricate business rules and processes. This methodology stands in contrast to traditional practices, such as embedding business logic within databases, offering a more adaptable and business-focused perspective.

Section 1: Understanding Domain-Driven Design

Definition and Focus

DDD is centered around developing software that intricately reflects the business models it aims to serve. It emphasizes a deep understanding of the business domain, ensuring that the software development process is driven by this knowledge, thereby facilitating a common language between developers and business stakeholders.

History and Evolution

Pioneered by Eric Evans, DDD has grown from a set of principles into a comprehensive approach, widely recognized for its ability to tackle complex business challenges through software.

Aligning Design with Business Needs

The essence of DDD lies in its focus on business-relevant software development, a principle that aligns closely with the need for software to be adaptable and directly linked to business objectives.

Section 2: Core Concepts of Domain-Driven Design

In DDD, concepts like Entities, Value Objects, Aggregates, Domain Events, Repositories, and Bounded Contexts form the foundation of a robust domain model.

Entities and Value Objects: Entities are defined by their identity, playing a crucial role in maintaining business continuity, while Value Objects add depth and integrity to the domain model.
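As a concrete illustration (a hypothetical sketch, not drawn from any particular codebase), a Money value object in Swift has no identity, is immutable, and enforces its own invariants:

```swift
import Foundation

// A hypothetical Money value object: it has no identity, is immutable,
// and enforces its own invariants (amounts in different currencies
// cannot be added).
struct Money: Equatable {
    let amount: Decimal
    let currency: String

    func adding(_ other: Money) -> Money? {
        // The invariant lives in the domain model, not the database.
        guard currency == other.currency else { return nil }
        return Money(amount: amount + other.amount, currency: currency)
    }
}
```

Because equality is defined purely by attributes, two Money values with the same amount and currency are interchangeable, which is exactly what distinguishes a value object from an entity.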

Domain Model vs Database-Level Logic

The decision to embed business logic in the domain model rather than in the database is pivotal. Traditional database-centric approaches can lead to scalability challenges and obscure the business logic from the development team. A domain-centric approach, as proposed by DDD, enhances clarity, flexibility, and testability, a significant shift from database-heavy methodologies.
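To make the contrast concrete, here is a hypothetical Swift sketch: the discount rule lives in the domain model as ordinary, testable code rather than in a stored procedure or database trigger.

```swift
import Foundation

// Hypothetical domain entity: defined by its identity (id), with the
// business rule (discounting) living here rather than in the database.
struct Order {
    let id: UUID
    var total: Decimal

    // The rule is explicit, versioned with the code, and unit-testable
    // without a database connection.
    mutating func applyDiscount(percent: Decimal) {
        precondition(percent >= 0 && percent <= 100, "invalid discount")
        total -= total * percent / 100
    }
}
```

A change to the discount policy is now a code change with tests, not a migration of database logic that every consumer must rediscover.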

Impact of Database Logic vs. Domain Model Flexibility

A critical aspect to consider in software architecture is how changes in business logic affect different layers of the application, particularly the database and the user interface (UI). Traditional approaches that embed business logic in the database often lead to a rigid structure where changes in the database logic can cascade up, impacting the UI layer significantly. This rigidity can result in a cumbersome and time-consuming process for implementing changes, especially when business requirements evolve frequently.

In contrast, Domain-Driven Design offers a more flexible approach. By encapsulating business logic within the domain model rather than the database, DDD allows for independent management of how data is formatted and handled at different levels of the application. This separation of concerns means that:

  • Changes at the Database Level: Alterations in the database schema or logic can be absorbed by the domain model without necessarily impacting the UI. The domain model acts as a buffer, allowing for adaptations in the data representation without requiring changes in the user interface.

  • UI Flexibility: The UI can evolve independently of the database structure. The domain model can format and present data in ways that are most suitable for user interaction, irrespective of how that data is stored or processed in the backend.

  • Bi-directional Adaptability: The domain model offers flexibility in both directions – it can adapt to changes in the database while also accommodating different requirements or formats needed by the UI. This adaptability is key in modern applications where user experience is paramount and business requirements are ever-changing.

By adopting a domain-centric approach as advocated by DDD, applications become more resilient to change and more aligned with agile development practices. This flexibility is a significant advantage in today’s fast-paced and user-centric software development environment.

As we have seen, the core concepts of DDD – Entities, Value Objects, Aggregates, Domain Events, Repositories, and Bounded Contexts – are essential in crafting a domain model that is both robust and flexible. This domain-centric approach, emphasizing clarity and adaptability, marks a significant shift from traditional database-heavy methodologies. The impact of this shift is profound, not only in the architecture of the software but also in the way it aligns with and supports business objectives. Now, let’s explore how these theoretical concepts translate into real-world benefits, further underlining the value of DDD in modern software development.

Section 3: Benefits of Implementing DDD

The implementation of Domain-Driven Design goes beyond just shaping a technical framework; it brings several key advantages that enhance both the development process and the final software product. These benefits, stemming directly from the principles and concepts discussed earlier, include enhanced communication across teams, improved software quality, and greater scalability and maintainability. In this section, we will delve into each of these benefits in more detail, showcasing how the foundational principles of DDD contribute to effective and efficient software development.

  • Enhanced Communication: The adoption of a ubiquitous language and a shared understanding of the domain model bridges the gap between technical teams and business stakeholders. This common ground improves collaboration and ensures that business requirements are accurately translated into technical solutions. In practice, this leads to fewer misunderstandings, more efficient development cycles, and solutions that better meet business needs.

  • Improved Software Quality: By focusing deeply on the domain, developers create solutions that are more aligned with the actual business problems they are meant to solve. This alignment results in higher-quality software that is not only functional but also robust and resilient to changing business requirements. Furthermore, the modular nature of DDD allows for more targeted testing and quality assurance, leading to more reliable and maintainable code.

  • Scalability and Maintainability: DDD's emphasis on a well-structured domain model facilitates the creation of software architectures that are easier to scale and maintain over time. This is especially beneficial in complex systems where changes are frequent and scalability is a concern. The clear separation of concerns and bounded contexts within DDD makes it easier to isolate and address specific areas of the system without impacting the whole, thereby enhancing maintainability.

Section 4: Real-World Applications of DDD

DDD has been successfully applied across various industries, from finance to e-commerce, demonstrating its versatility in aligning software solutions with complex business needs.

Rebuilding guardian.co.uk with DDD

The Custom House development team applied DDD, making particular use of value objects.

Conclusion

Domain-Driven Design represents a sophisticated methodology for aligning software development with business complexities. It offers a stark contrast to traditional database-centric approaches, advocating for a more agile and business-focused development process. DDD not only addresses technical requirements but also closely aligns with business objectives, making it an essential approach in modern software development.

Are you intrigued by the possibilities of Domain-Driven Design? If you're looking to delve deeper into DDD or considering a shift from traditional database-centric models to a more domain-focused approach, we invite you to join the conversation. Embrace the complexities of modern software development by exploring DDD, a methodology that can unlock unprecedented levels of adaptability and alignment with business goals in your projects. Dive into the world of DDD to discover how it can transform your approach to software development and help you navigate the intricate landscape of business needs and technical challenges. Stay informed, stay ahead, and let DDD guide your journey in the evolving world of software innovation.


Making the case for the WebView

Learning to embrace a hybrid approach for mobile app development:

Native apps are best!

Like the rest of the native mobile app development community, I typically agree with the notion that “native is best” when it comes to mobile apps. After all, these are the technologies we spend tens of hours every week utilising, and there is a passion for user experience that I feel is required in order to happily dive into the deep end of specialising in Google or Apple’s tooling.

However...

As one of these oft-opinionated app developers who tends to view non-native tooling like React Native as a sub-par user experience, I have a potentially unpopular idea to share.

Sometimes, a nested WebView does have its place.

The pitch

It might not be popular with the purists, but please hear me out. The nested WebView does sometimes have an important role to play (emphasis on nested!).

When facing the challenge of balancing new features, requested enhancements, required platform changes and general maintenance, maintaining the required degree of parity between your mobile app and its accompanying website can be difficult.

In my experience, here are some key factors to consider when deciding whether to implement a flow natively within your app or whether to utilise a nested WebView:

  • Is it a critical piece of functionality that users will frequently use?
  • To what degree is the content static or interactive?
  • Will it be subject to frequent changes?
  • How much content is there to display?
  • Will the overhead of loading a WebView (including any relevant JavaScript) be fast enough for the use case?
  • Do you feel the technical hurdles of developing it natively will result in a better user experience?

There is no definitive flow chart that can help make this decision for you. However, when balancing time and priorities, it can make sense to utilise the hard work of your fellow web engineers. This frees up your mobile team to focus on nailing the quality of the key parts of your app that users interact with the most.

Hiding the seams

A key part of making the experience great for your users is to make the transition from native flows to WebViews as seamless as possible.

Some things to consider:

  • Nest any WebViews to keep the user inside your application, retaining the framing and navigation components of your app’s native experience.
  • Where there is navigation, bring the user back to your native experience wherever possible.
  • Maintain the user’s state, such as login details (e.g. by utilising HTTP cookies).
  • Limit the user’s ability to navigate away from the intended page (e.g. by hiding the web page’s headers/footers).
  • Persist light/dark mode theming if possible.
  • Look to combine native views and WebViews within the one screen where appropriate; you don’t necessarily need to make the whole screen a WebView to take advantage of what they offer!
  • Leverage a tool like Firebase Remote Config to supply required configurations to your app, so you can update things after release if required.
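As an iOS-flavoured sketch of the cookie tip above (the domain, cookie name, and session plumbing are placeholders, not Kogan's actual setup):

```swift
import WebKit

// Sketch: inject an existing session cookie into a nested WebView so
// the user stays logged in. Domain and cookie name are placeholders.
func makeLoggedInWebView(sessionToken: String) -> WKWebView {
    let cookie = HTTPCookie(properties: [
        .domain: "example.com",
        .path: "/",
        .name: "sessionid",
        .value: sessionToken,
        .secure: "TRUE",
    ])!

    let configuration = WKWebViewConfiguration()
    let webView = WKWebView(frame: .zero, configuration: configuration)
    // Seed the WebView's cookie store before loading any pages.
    webView.configuration.websiteDataStore.httpCookieStore.setCookie(cookie)
    return webView
}
```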

An example of an embedded WebView within the Kogan.com Android application. The user can navigate beyond this page but only where intended.

An example of bringing the user back to your native experience from within a WebView.

We render the product description inside a WebView, embedded within a native XML Android layout. This means we get to utilise native tooling where we can, and outsource frequently-updated content with complex formatting.


It’s all about balance

Like a lot of things in life, it’s all about balance! What makes mobile apps special are the small details of native layouts, the gestures, the animations, but those take time, and time is precious in a development world of many priorities to juggle.

I encourage anyone who works in the space to consider where and how you can leverage the hard work of your web friends to get relevant functionality in the hands of users with a little less stress, giving your team more time to focus on the most important parts of your app 😀

Decreasing CI build times by up to 50% by caching derived data using GitHub Actions

We had a problem: our CI pipeline was increasingly becoming a bottleneck in our iOS continuous integration. At Kogan we like to develop at a fast pace, but we were constantly held up waiting for builds to complete, which caused a lot of frustration within the team. The rest of the engineering team had switched to GitHub Actions (GHA), and with us still on CircleCI, it was time to make the change. It was also the perfect opportunity to re-evaluate our pipeline and make it as efficient as possible. With build times regularly exceeding 30 minutes, there was a lot of room for improvement.

As we made the switch, we brainstormed ways to improve the overall efficiency of the pipeline, and we kept returning to derived data. This is how Apple handles build caching within Xcode, but could we use it within our CI pipeline too? A bit of investigation showed we weren’t the first to have this thought, and we used this blog post (https://michalzaborowski.medium.com/circleci-60-faster-builds-use-xcode-deriveddata-for-caching-96fb9a58930) as a base for our improvement.

So, where to begin? We started by replicating our current CI pipeline in GHA, which was pretty smooth apart from a few challenges accessing some of our private repositories. Build times improved slightly with the switch, but still regularly exceeded 30 minutes. We were already caching the Swift packages used in the project, but there was still plenty of room for improvement.

First we need to ensure that we fetch the full history of the repository, which can be done simply by setting the option fetch-depth: 0 on the checkout action in our existing initial step.

    - uses: actions/checkout@v4
      with:
        token: ${{ secrets.GITHUB_TOKEN }}
        fetch-depth: 0

We then need to cache the derived data. We need to do this in two parts - caching the derived data when a pull request has been successfully merged, and then also restoring the latest derived data cache to the CI pipeline whenever a pull request is opened.

In order to identify the latest develop commit, we use a GHA marketplace action that finds the nearest develop commit and exposes its SHA as a variable.

    - name: Create variable for the nearest develop commit SHA
      id: setSHAs
      uses: nrwl/nx-set-shas@v3
      with:
        main-branch-name: 'develop'

Then we need to create a separate pipeline, used to save the derived data whenever a pull request is merged to develop. It is very similar to our original flow, except that it saves the cache at the end, like below. This caches the tmp/derived-data directory (which we have configured as the derived data location in fastlane) against the latest develop commit SHA.

    - uses: actions/cache/save@v3
      name: Save Derived Data Cache
      with:
        path: tmp/derived-data
        key: v1-derived-data-cache-${{ steps.setSHAs.outputs.head }}

Next we need to get the correct cached derived data in our CI pipeline for pull requests. We need to again use the latest develop commit SHA to find the correct derived data cache. We use the restore version of the same action used above in order to find the right cache. This will either find a cache with an exact match, or it will fall back and use the most recent derived data with a partial match.

    - uses: actions/cache/restore@v3
      name: Restore Derived Data Cache
      with:
        path: tmp/derived-data
        key: v1-derived-data-cache-${{ steps.setSHAs.outputs.head }}
        restore-keys: |
          v1-derived-data-cache-

As the blog post mentioned, GHA sets each file's last-modified time to the time it was cloned. Since Xcode uses these timestamps to decide what to rebuild, we need to restore them in order to take advantage of the derived data cache. We found a GHA marketplace action that does exactly this.

    - name: Update mtime for incremental builds
      uses: chetan/git-restore-mtime-action@v2

Last but not least, we need to set IgnoreFileSystemDeviceInodeChanges to YES to ensure Xcode does not consider our cached derived data to be out of date.

    - name: Set IgnoreFileSystemDeviceInodeChanges flag
      run: defaults write com.apple.dt.XCBuild IgnoreFileSystemDeviceInodeChanges -bool YES

With all of that complete, we have successfully sped up our CI pipelines and decreased our build times by up to 50%. On CircleCI we were regularly exceeding 30 minutes; after switching to GHA and caching derived data, our builds are down to roughly 15 minutes. This is a massive improvement and has definitely made us developers much happier!

Looking forward, we want to carry on improving our pipeline and are always looking for ways to keep it up to date and as fast as it can be. One problem we have encountered with this caching method is that, when a build problem arises, there is no way to fully clear the cache and force a build without cached data other than manually deleting each cache entry individually. This can be time consuming, so we would like to investigate further and find ways to mitigate it.

Project Spotlight: Optimizing ChannelAdvisor Integration: Real-Time Product Catalog Synchronization

Introduction:

In today's e-commerce landscape, seamless integration with third-party platforms is essential for expanding reach and boosting sales. This technical blog post delves into Kogan.com's ChannelAdvisor integration project, offering insights to software engineers on event-driven architectures, infrastructure automation, and efficient CI/CD practices.

Understanding the Challenge:

Integrating with ChannelAdvisor posed a significant hurdle. As a renowned e-commerce platform bridging various marketplaces like eBay and Amazon, ChannelAdvisor promised wider customer reach for Kogan.com. However, maintaining real-time accuracy for stock and pricing data was crucial to prevent out-of-stock purchases and maintain a positive customer experience. The challenge was to ensure timely updates without burdening our production database.

Leveraging BigQuery and Event-Driven Architecture:

To address synchronization challenges, we harnessed Google BigQuery for efficient management of product information. While BigQuery's response time was slightly slower, its capacity to handle extensive catalogs proved fitting. Recognizing that BigQuery's data wasn't frequently updated, we embraced an event-driven architecture through Amazon's EventBridge. This framework generated stock and pricing update events upon changes, which were processed by Lambda functions to update ChannelAdvisor. This solution guaranteed accurate and timely stock and pricing information, enhancing customer experience.

Exploring New Technologies: Terraform and CI/CD Automation:

Within the ChannelAdvisor integration, we ventured beyond our usual tech stack to explore Terraform, an infrastructure provisioning tool. Incorporating Terraform into our UAT and Production environments allowed us to define infrastructure as code, ensuring consistent and reproducible deployments. Integration with GitHub Actions streamlined deployment by triggering Terraform through automated testing and linting checks. Furthermore, Terraform was configured to establish a dedicated test environment for each pull request, facilitating UAT and accelerating iteration cycles. This automation led to faster, reliable deployments and efficient development.

Final Outcome:

The ChannelAdvisor integration unveiled the complexities of synchronizing a vast product catalog in real time. Through Google BigQuery, event-driven architecture via Amazon's EventBridge and Lambda functions, we maintained accuracy in stock and pricing updates. Our exploration of Terraform and CI/CD practices introduced efficiency in infrastructure automation and deployment. Ultimately, this project demonstrates how innovative solutions and streamlined practices optimize e-commerce operations, elevating the customer experience.

Project Spotlight: Optimising Product Updates and Creates in Pipelines

Challenge - Our platform, which houses millions of products, is subject to regular updates by Marketplace sellers via APIs and by Kogan.com's internal teams. However, an arduous average update pipeline duration of 8 hours hindered the prompt visibility of these changes to customers on the website or app.

Determined to tackle this challenge, we focused on expediting the update process while concurrently reducing infrastructure costs, so that our customers could swiftly access the most recent product information.

An analysis of the update pipeline revealed several areas requiring our immediate attention:

  1. Repetitive Update Events - recurrent API calls made by Marketplace sellers resulted in duplicated update events for the same product. This redundancy significantly impeded operational efficiency.
  2. Sluggish Database Queries - inefficiencies in database queries engendered prolonged processing times, undermining the timely dissemination of updated product information.
  3. Architectural Bottlenecks - restrictive architectural elements acted as bottlenecks, limiting system throughput and curtailing overall performance.

To overcome these hurdles, the following measures were taken:

  1. Enhanced Diffing Logic - by leveraging an OpenSearch cluster, we overhauled our diffing logic. The cluster facilitated real-time comparison of incoming API payloads; if no alterations were detected in the product information, redundant requests were skipped, culminating in a more expeditious system.
  2. Optimized Database Queries - a thorough optimization of database queries was undertaken, incorporating prefetching techniques and mitigating the prevalence of joins and n+1 queries. This overhaul addressed the root causes of sluggishness, resulting in expedited processing times.
  3. Introduction of a Datastore - a new datastore was introduced, specifically designed to house denormalized product data. This datastore became the basis of product information displayed to customers, effectively reducing dependence on OpenSearch and enabling it to be gradually scaled down. This refined infrastructure bolstered system responsiveness and agility.

By implementing these measures, we were able to address the inefficiencies in our product updates in a timely manner. Our actions played a big part in providing our customers with a more seamless experience on our site! This also enables us to scale our operations so we can support a wider range of products and continue meeting the needs of our customers in the future.

5 Simple Tips For Optimising Query Performance in BigQuery

Optimizing query performance in BigQuery presents large cost-saving opportunities for the engineering team, as well as a speed boost which benefits the business, stakeholders and developers alike.

Following on from our meeting with the Google team, I was inspired to do some further learning and challenge myself to write a short blog post with my top 5 simple tips for writing better queries.

This is by no means an exhaustive list, or even the best tips. My hope is that this blog post will inspire you (and me) to do further research and practice.

Rather than provide example code which is useful but boring, I've gone with pictures instead which are less useful but also way less boring.

I will also provide a couple links at the end of the post for those who want to learn more.

Tip 1. Avoid SELECT *

OK I admit it, I have been guilty of using SELECT * for massive tables believing that applying the LIMIT clause will reduce bytes processed. I was wrong.

Best Practice

  • Want to explore the data? Use the data preview function instead of SELECT *
  • Query only the columns you need.

    • eg SELECT dim1, metric1 FROM … or
    • SELECT * EXCEPT (dim2, metric2) FROM …
  • If you do require every column, consider:

    • Materialising results in a destination table and querying that table instead.
    • Partitioning your tables and querying the relevant partition.

Tip 2. Table JOIN Order Matters

The aim here is to reduce the dataset as early as possible. I thought that BQ optimized joins automatically, but apparently this is not always the case.

Best Practice

  • Manually place the largest table first, followed by the table with the fewest rows, then the remaining tables in order of decreasing size.
    • You want to reduce your resultset the most at the first join.

Tip 3. Know when to aggregate early and late.

Aggregation is very costly so care needs to be taken.

Best Practice

  • As a general rule aggregate and perform complex operations as late and as seldom as possible.
  • UNLESS a CTE (Common Table Expression) can be reduced drastically by aggregating before joining, then aggregate early! You want to prune data as much as possible before a join.
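I know I promised pictures instead of code, but a toy Python illustration of the pruning effect might help here. The customers and orders tables below are hypothetical stand-ins for two CTEs about to be joined:

```python
# Toy illustration of "aggregate before joining": collapsing the larger side
# first means far fewer rows flow into the join.
from collections import defaultdict

customers = [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]
orders = [
    {"customer_id": 1, "total": 10},
    {"customer_id": 1, "total": 20},
    {"customer_id": 2, "total": 5},
    {"customer_id": 2, "total": 15},
]

# Late aggregation: join first (one output row per order), then sum later.
joined = [(c["name"], o["total"]) for c in customers
          for o in orders if o["customer_id"] == c["id"]]
# The join produces one intermediate row per order.

# Early aggregation: collapse orders to one row per customer, then join.
totals = defaultdict(int)
for o in orders:
    totals[o["customer_id"]] += o["total"]
rows = [(c["name"], totals[c["id"]]) for c in customers]
# Now only one row per customer flows through the join.
```

With millions of orders per customer the difference between the two intermediate result sizes is exactly the data you pruned before the join.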

Tip 4. When using WHERE clauses AND…, put the most selective first.

Again, the point is to reduce the data as early as possible.

Best Practice

  • Order expressions in your WHERE clause with the most selective expression first. (i.e. the one which will eliminate the most rows)
  • Then: Use expensive expressions like LIKE ‘%text%’ only after you have reduced your dataset
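The same principle in Python terms: `and` short-circuits just like a well-ordered filter, so an expensive check (a stand-in for something like LIKE '%text%') only runs on rows that survive the cheap, selective one. The column names here are made up:

```python
# Demonstrate that placing the selective predicate first means the
# expensive predicate is evaluated on far fewer rows.
calls = 0

def expensive_match(text: str) -> bool:
    # Stand-in for an expensive expression such as LIKE '%sale%'.
    global calls
    calls += 1
    return "sale" in text

# 1000 rows, of which only 10 match the cheap, selective country filter.
rows = [{"country": "AU", "desc": "big sale"} if i % 100 == 0
        else {"country": "US", "desc": "no match"} for i in range(1000)]

# Selective filter first: expensive_match runs only on the 10 AU rows,
# not all 1000.
hits = [r for r in rows if r["country"] == "AU" and expensive_match(r["desc"])]
```

(SQL engines don't strictly guarantee short-circuit evaluation the way Python does, but ordering predicates by selectivity is the documented BigQuery guidance all the same.)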

Tip 5. Use Approximate Functions where possible.

I had no idea these even existed. If you don't need precision greater than ~99%, approximate functions yield faster query performance.

Best Practice

  • Where appropriate, use approximate functions. (e.g. in data exploration)
    • Eg use APPROX_COUNT_DISTINCT(dim1) instead of COUNT(DISTINCT dim1)

Hack Day - Chat (Khat) GPT

Why did the AI buying assistant start a stand-up comedy career? Because it knows how to find the best "deals" of laughter!

Here is hoping ChatGPT is better at being a buying assistant than it is at writing jokes!

ChatGPT is the hottest topic around the industry and, inspired by a recent tech talk by our CTO, we used our most recent hack day to try and answer some questions about what Kogan.com could use ChatGPT for. Can we train ChatGPT with knowledge of our products so that it can be a buying assistant to our customers? Can we train it with our FAQs and customer support information so that it can help answer customer queries more efficiently? Through a few different experiments, we tried to answer these questions.

So what is ChatGPT?

ChatGPT is an advanced artificial intelligence language model developed by OpenAI that specialises in understanding and generating human-like text. It works by analysing large amounts of text data and predicting the most likely responses or content based on the context provided, leading to highly accurate and coherent interactions.

Where to begin?

The ChatGPT team on our hack day was the largest and involved a good mixture of different types of developers. We split into two main segments - our backend developers focused on training the model with our data and creating an API that could be interacted with. Our mobile developers focused on adding a chat feature to our current app, which would interact with the created API.

Implementation

Training the model

Starting out, we investigated which language model would be the best fit for our application. Well-known models like GPT 3.5/4 are best suited for precise answers, whereas models like Davinci are better at extended, conversational responses. As this was a support-type bot with a general audience, we went with Davinci for a better user experience.

The process of training the bot involved creating a GPT vector index containing all our data, using the OpenAI language model we'd selected. We used the Python library LlamaIndex to facilitate this process. We were then able to query our GPT vector index with our questions.

We trained one of the models using product descriptions and product manuals and the second included customer support FAQs and a collection of fake customer support tickets to see if we could train positive interactions into the model.

So, with our models in hand, we jumped in with some questions!!

Immediately we found that the model was picking up general information from our customer support interactions, such as customer names, and using it to assume the current user's identity. Questions like,

“How can I get a refund on my order”,

would often come back with responses like,

“Sure, to receive a refund from Kogan…”

The security/privacy implications of a model being trained with conversations were becoming obvious and seemed like a bigger problem for another day, so we had to park the customer support branch of this project for the day.

We focused on the product assistant for the rest of the day, which turned out to be a great decision as we got some great results.

API Setup

In addition to building the model, we also needed to host it somewhere to make it available for the mobile app to consume. Initially, we started building out a Lambda function behind an API Gateway in AWS to host the application, but in the final hour of our hack day we realised that the model wouldn't fit into the Lambda function's available space. We quickly pivoted, spinning up an EC2 instance using AWS Lightsail and setting up Flask as our API. It was great to see the team mitigate issues like this as they popped up.
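For the curious, here is a minimal sketch of the API layer's shape, using only the Python standard library (the real version used Flask on EC2, and `query_model` is a hypothetical stand-in for the GPT vector index lookup):

```python
# Minimal stdlib sketch of a JSON question/answer endpoint.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def query_model(question: str) -> str:
    # Hypothetical stand-in for querying the trained GPT vector index.
    return f"Answer to: {question}"

class AskHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read the JSON body, query the model, and return a JSON answer.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        answer = query_model(payload.get("question", ""))
        body = json.dumps({"answer": answer}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# To serve: HTTPServer(("0.0.0.0", 8000), AskHandler).serve_forever()
```

The Flask version is the same shape: one POST route that parses a question, queries the index, and returns JSON.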

Solution

At the end of the day, the team was able to showcase a working prototype, with our mobile app integrating with the API and returning some interesting responses. The most promising response we got was a question related to safety information contained within one of our product guides.

Q: “Can I use the Estelle Hair dryer in the bathroom?”

A: “No, you should not use the Estelle Hair Dryer in the bathroom. According to the safety warnings, it should not be used in the bath or shower, near a bathtub, washbasin or other products that contain water, and should be unplugged after use and the speed and heat settings should be set as required before use if it is being used in the bathroom.”

Another example of a general question was:

Q: “What options are available in the Kogan washing machine?”

A: “The Kogan 7kg Top Load Washer offers a range of options, including 7kg clothes capacity, 6 wash programs and memory function, 8 water levels and 9 types of water power, LED display and delay start function, fuzzy logic control and detergent dissolve tech, child safety lock feature, and a drain pump filter to avoid blockages. It is made from high-quality, clear ABS material, making it easy to maintain and durable for everyday use. It also features personal preference settings, allowing you to customise your washing experience.”

At times, the model wouldn’t align the correct information with a given product but this was always good for a laugh.

Q: “Which Kogan toaster would best suit my family of 6”

A: “The Kogan 4 Slice Toaster (Silver) would best suit a family of 6. It features two independent browning controls, seven adjustable browning settings, built-in Defrost and Reheat functions, an ultra-sleek silver design, a weight capacity of 50kgs, and a frame material of wood. It also has a removable crumb tray at the bottom for easy cleaning.”

To improve this we could have provided our data in a more structured format, but for this experiment all our data was unformatted.

One of the main problems we discovered was the time it took to get a response from the API. Responses often took 5-10 seconds and arrived as one complete block, rather than streaming word by word like the ChatGPT client in your browser does. This was just a limitation of the library implementation and our setup, so it could be mitigated with a more bespoke solution.

As an added bonus, our mobile developers also took the time to work with SwiftUI, with one even using ChatGPT to assist along the way, and we were all impressed with its responses.

Final Thoughts

Overall it was definitely a successful hack day for the ChatGPT team, and it highlighted the immense potential of ChatGPT use cases at Kogan.com. Hack days are a great opportunity to collaborate with squads within the engineering team that we wouldn't usually interact with on a daily basis.