Deleting a few thousand records defined by such a simple filter should be easy. But it is impossible without resorting to complex API-call loops.
The obvious thing would be to use a Rule to delete old records, but this does not work because:
Rules suffer from the same timeout issues.
There is no way to use LIMIT with Rules (unless in a script).
Rules run at most once per hour, which is not enough when each API call is limited to deleting 50 records and there are tens of thousands to delete.
I have resorted to looping a local script that repeatedly sends a GraphQL query to delete 50-70 records at a time. Each query can take 20-60 seconds to finish, or it might just return an error, with no indication of what (if anything) was actually accomplished.
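For reference, a loop like the one described can be sketched as below. This is a minimal sketch, not Fibery's actual API: the entity name `calls` and the mutation shape are hypothetical (the real field names come from your workspace's generated GraphQL schema), and `send` stands in for an authenticated POST to the GraphQL endpoint.

```python
import time
from typing import Callable, Sequence


def chunked(ids: Sequence[str], size: int):
    """Yield successive batches of at most `size` ids."""
    for i in range(0, len(ids), size):
        yield list(ids[i:i + size])


def build_delete_mutation(ids: list) -> str:
    # Hypothetical mutation shape -- the real entity and field names
    # depend on your Fibery space/type; check the generated schema.
    id_list = ", ".join(f'"{i}"' for i in ids)
    return f'mutation {{ calls(id: {{in: [{id_list}]}}) {{ delete {{ message }} }} }}'


def delete_in_batches(ids, send: Callable[[str], bool],
                      batch_size: int = 50, max_retries: int = 3,
                      pause: float = 1.0) -> int:
    """Delete ids in batches, retrying each failed batch with backoff.

    Returns the number of batches that were acknowledged as sent.
    Batches that still fail after max_retries are skipped, which is
    exactly the "no indication of what was accomplished" problem.
    """
    ok = 0
    for batch in chunked(ids, batch_size):
        query = build_delete_mutation(batch)
        for attempt in range(max_retries):
            if send(query):  # send() would POST to the GraphQL endpoint
                ok += 1
                break
            time.sleep(pause * (attempt + 1))  # linear backoff before retrying
    return ok
```

In practice `send` would POST the query with an API token to the workspace's GraphQL URL; the point of the sketch is how much client-side scaffolding (batching, retries, backoff) a "simple" bulk delete currently requires.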
Is this really the best we can do in a modern “no code” platform?
It should not take days to delete a few thousand records.
I can’t see an unanswered chat from you, but I can see a chat where we are still digging into the problem. I expect @Kseniya_Piotuh will get back to you when there is news.
It certainly sounds like there is something funny going on, beyond what should be expected for a large database.
I’m also keen to understand the architectural decisions behind Fibery. When a system like Fibery already struggles with rendering pages or performing tasks at 100k records, it’s a serious scalability or performance-optimization issue.
Specifically, how have you designed and optimized the system to efficiently handle large datasets and address potential performance bottlenecks? For example:
Handling performance slowdowns associated with data growth.
Addressing physical storage limitations.
Managing resource-intensive complex queries.
Data Management Strategies:
Techniques to optimize database queries.
Implementation of connection pools.
Use of database indexing and caching.
Approaches to avoid complex joins and reduce the number of database queries.
Methods for data compression to reduce storage space and improve retrieval times.
User Experience and Data Retrieval:
Use of pagination to manage data display.
Implementation of lazy loading for on-demand data retrieval.
Vertical scaling strategies to enhance a single server’s capabilities.
Horizontal scaling methods, including the use of replication and sharding.
How you handle query routing in sharded systems.
Ensuring data consistency in replicated databases.
Managing the increased administrative tasks associated with scaling.
Infrastructure and Hardware:
Strategies for hardware and infrastructure scaling, including cloud solutions.
Techniques for archiving old, infrequently accessed data.
Techniques to pinpoint performance bottlenecks.
Strategies to ensure optimal resource usage.
Thank you in advance for shedding light on these aspects.
I don’t think you can expect to get an in-depth answer to this.
Fibery has a team of experienced developers, and the system has been designed with various performance criteria in mind, including handling large datasets.
We are considering making some constraints/limitations explicit, so that users can set their expectations at the correct level.
We are actively reviewing the case that @Matt_Blais has raised.
Thanks for the feedback. Regarding the GraphQL upstream errors: I would like to let you know that we are working on functionality for starting and monitoring background jobs for such tasks. It looks like the only way, in the near future, to execute long-running processes without failures. We will keep you informed about that. Sorry for the inconvenience.
We have implemented a way to execute GraphQL mutations as background jobs to avoid timeouts.
You can find documentation here: Fibery GraphQL API
Please let me know how it goes if you have a chance to try it.
I know it doesn’t solve all of your issues, but we won’t give up, and we will continue tuning automation performance to improve things there.
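From the client side, the background-job execution described above boils down to a start-then-poll pattern. The sketch below shows only that pattern; `start_job` and `get_status` are placeholders, not real Fibery endpoints, and the status strings are assumptions — the actual job API is in the Fibery GraphQL documentation linked above.

```python
import time


def run_background_mutation(start_job, get_status,
                            poll_interval: float = 5.0,
                            timeout: float = 3600.0) -> str:
    """Start a long-running mutation as a background job, then poll it.

    `start_job()` returns a job id; `get_status(job_id)` returns the job's
    current status. Both are placeholders for the actual API calls.
    Returns the final status, or "timeout" if the deadline passes first.
    """
    job_id = start_job()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status(job_id)
        if status in ("completed", "failed"):  # assumed terminal states
            return status
        time.sleep(poll_interval)  # keep polling until the job finishes
    return "timeout"
```

The advantage over a plain mutation is that the HTTP request returns immediately with a job id, so no single request has to stay open long enough to hit the gateway timeout.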
This is what I need to do to clean up and delete older entities:
I have ~586k “Call” records which are linked to ~35k “Call Stats” records.
I will use a long-running GraphQL call to unlink 450k Call records from Stats.
Every unlink triggers a Stats Rule which adds the unlinked Call’s values to the Stats.
When I do this manually, I can see the Stats values increasing as the Rule processes through all the unlinked Calls (about 10-20 per second).
I have no problem waiting a long time for everything to finish, but this will create a very deep queue of unlinked Calls for the Rule to process. If the Rules do not all eventually finish successfully, that would be a problem (data inconsistency).
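For scale, a quick back-of-envelope estimate of how long that Rule queue would take to drain, using the 10-20 entities/second throughput observed above (these are the thread's own numbers, not measured benchmarks):

```python
# Estimate queue-drain time for 450k unlinked Calls at the observed
# Rule throughput of 10-20 entities per second.
records = 450_000
for rate in (10, 20):  # entities processed per second by the Rule
    hours = records / rate / 3600
    print(f"{rate}/s -> {hours:.2f} hours")
```

So even in the best case the Rule queue stays deep for roughly 6 to 12 hours, which is why a single stalled or dropped Rule execution in that window would silently leave the Stats totals inconsistent.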
Should I wait to start this process? Or do you believe the system can currently handle this scenario correctly?